[SOLVED] Dedicated server : iGPU passthrough no IOMMU at all in /sys/kernel/iommu_groups

Orphee

Member
Mar 8, 2022
Hello,

I have a dedicated server online.

It has a Xeon E3-1245 v2.

I installed Proxmox using their image template; everything works except PCI passthrough.

I also have a personal server at home running Proxmox too; on that one I'm able to pass through my 2 NVIDIA cards and my iGPU.

So I generally know how to get passthrough working.

But on this online dedicated server, no matter what I try, /sys/kernel/iommu_groups stays empty...

Code:
~# lspci -s 00:02.0 -nnkkvq
00:02.0 VGA compatible controller [0300]: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller [8086:016a] (rev 09) (prog-if 00 [VGA controller])
        DeviceName: Intel(R) HD Graphics Device
        Subsystem: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller [8086:2002]    
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at fe000000 (64-bit, non-prefetchable) [size=4M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at f000 [size=64]
        Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
        Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [a4] PCI Advanced Features
        Kernel modules: i915

~# ls -l /sys/kernel/iommu_groups/
total 0

~# dmesg |egrep -i "dmar|iommu|x2apic"
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.15.104-1-pve root=UUID=8a1c83b5-ab35-41d3-85af-aefe9389383f ro quiet intel_iommu=on iommu=pt
[    0.067146] Kernel command line: BOOT_IMAGE=/vmlinuz-5.15.104-1-pve root=UUID=8a1c83b5-ab35-41d3-85af-aefe9389383f ro quiet intel_iommu=on iommu=pt
[    0.067210] DMAR: IOMMU enabled
[    0.181104] x2apic: IRQ remapping doesn't support X2APIC mode
[    0.265449] iommu: Default domain type: Passthrough (set via kernel command line)

I know VT-x is enabled and working :

Code:
root@pve:~# kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used
root@pve:~# lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   36 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           58
Model name:                      Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz
Stepping:                        9
CPU MHz:                         3800.000
CPU max MHz:                     3800.0000
CPU min MHz:                     1600.0000
BogoMIPS:                        6784.77
Virtualization:                  VT-x
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        1 MiB
L3 cache:                        8 MiB
NUMA node0 CPU(s):               0-7
Vulnerability Itlb multihit:     KVM: Mitigation: Split huge pages
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds:               Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Mmio stale data:   Unknown: No mitigations
Vulnerability Retbleed:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Vulnerable: No microcode
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts flush_l1d

But compare this with my personal server:
Code:
~# dmesg |egrep -i "dmar|iommu"
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.104-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init
[    0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
[    0.007971] ACPI: DMAR 0x000000003B2401D8 0000C8 (v01 INTEL  EDK2     00000002      01000013)
[    0.008007] ACPI: Reserving DMAR table memory at [mem 0x3b2401d8-0x3b24029f]
[    0.077941] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.104-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init
[    0.077992] DMAR: IOMMU enabled
[    0.221227] DMAR: Host address width 39
[    0.221228] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.221232] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e
[    0.221234] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.221236] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[    0.221238] DMAR: RMRR base: 0x0000003b699000 end: 0x0000003b8e2fff
[    0.221239] DMAR: RMRR base: 0x0000003d000000 end: 0x0000003f7fffff
[    0.221240] DMAR: RMRR base: 0x0000003acde000 end: 0x0000003ad5dfff
[    0.221242] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.221243] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.221244] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.224370] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.529910] iommu: Default domain type: Passthrough (set via kernel command line)
[    0.640955] DMAR: No ATSR found
[    0.640956] DMAR: No SATC found
[    0.640957] DMAR: IOMMU feature fl1gp_support inconsistent
[    0.640958] DMAR: IOMMU feature pgsel_inv inconsistent
[    0.640958] DMAR: IOMMU feature nwfs inconsistent
[    0.640959] DMAR: IOMMU feature pasid inconsistent
[    0.640960] DMAR: IOMMU feature eafs inconsistent
[    0.640960] DMAR: IOMMU feature prs inconsistent
[    0.640961] DMAR: IOMMU feature nest inconsistent
[    0.640962] DMAR: IOMMU feature mts inconsistent
[    0.640962] DMAR: IOMMU feature sc_support inconsistent
[    0.640963] DMAR: IOMMU feature dev_iotlb_support inconsistent
[    0.640964] DMAR: dmar0: Using Queued invalidation
[    0.640966] DMAR: dmar1: Using Queued invalidation
[    0.641215] pci 0000:00:00.0: Adding to iommu group 0
[    0.641226] pci 0000:00:01.0: Adding to iommu group 1
[    0.641235] pci 0000:00:01.1: Adding to iommu group 2
[    0.641243] pci 0000:00:02.0: Adding to iommu group 3
[    0.641251] pci 0000:00:08.0: Adding to iommu group 4
[    0.641264] pci 0000:00:12.0: Adding to iommu group 5
[    0.641278] pci 0000:00:14.0: Adding to iommu group 6
[    0.641285] pci 0000:00:14.2: Adding to iommu group 6
[    0.641299] pci 0000:00:15.0: Adding to iommu group 7
[    0.641307] pci 0000:00:15.1: Adding to iommu group 7
[    0.641318] pci 0000:00:16.0: Adding to iommu group 8
[    0.641326] pci 0000:00:17.0: Adding to iommu group 9
[    0.641350] pci 0000:00:1b.0: Adding to iommu group 10
[    0.641372] pci 0000:00:1b.4: Adding to iommu group 11
[    0.641392] pci 0000:00:1c.0: Adding to iommu group 12
[    0.641410] pci 0000:00:1c.2: Adding to iommu group 13
[    0.641425] pci 0000:00:1c.5: Adding to iommu group 14
[    0.641446] pci 0000:00:1c.6: Adding to iommu group 15
[    0.641466] pci 0000:00:1c.7: Adding to iommu group 16
[    0.641477] pci 0000:00:1e.0: Adding to iommu group 17
[    0.641501] pci 0000:00:1f.0: Adding to iommu group 18
[    0.641509] pci 0000:00:1f.3: Adding to iommu group 18
[    0.641518] pci 0000:00:1f.4: Adding to iommu group 18
[    0.641526] pci 0000:00:1f.5: Adding to iommu group 18
[    0.641535] pci 0000:00:1f.6: Adding to iommu group 18
[    0.641548] pci 0000:01:00.0: Adding to iommu group 19
[    0.641558] pci 0000:01:00.1: Adding to iommu group 20
[    0.641569] pci 0000:02:00.0: Adding to iommu group 21
[    0.641579] pci 0000:02:00.1: Adding to iommu group 22
[    0.641600] pci 0000:04:00.0: Adding to iommu group 23
[    0.641619] pci 0000:06:00.0: Adding to iommu group 24
[    0.641642] pci 0000:07:00.0: Adding to iommu group 25
[    0.641662] pci 0000:08:00.0: Adding to iommu group 26
[    0.641665] pci 0000:09:00.0: Adding to iommu group 26
[    0.641684] pci 0000:0a:00.0: Adding to iommu group 27
[    0.641798] DMAR: Intel(R) Virtualization Technology for Directed I/O

There everything is OK; I don't understand why it does not work on the dedicated server.
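
One way to make the comparison between the two logs mechanical, as a minimal sketch: the personal server logs an `ACPI: DMAR` line, meaning the firmware handed the kernel a DMAR table, while the dedicated server only logs `DMAR: IOMMU enabled`, which just reflects `intel_iommu=on` on the command line. The helper name `has_dmar_table` and the saved-log path in the usage comment are illustrative assumptions, not part of any tool.

```shell
#!/bin/sh
# Sketch: classify a saved dmesg log by whether the firmware exposed a DMAR table.
# has_dmar_table is a hypothetical helper name used only for this example.
has_dmar_table() {
    # "ACPI: DMAR 0x..." is logged when the kernel finds the table in ACPI.
    # "DMAR: IOMMU enabled" alone only reflects intel_iommu=on being parsed.
    if grep -q 'ACPI: DMAR ' "$1"; then
        echo "DMAR table present"
    else
        echo "no DMAR table"
    fi
}

# Usage (path is an example): dmesg > /tmp/dmesg.txt && has_dmar_table /tmp/dmesg.txt
```

Run against the two logs above, this would separate the hosts: only the personal server's log contains the `ACPI: DMAR` line.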

I asked the provider to look at the BIOS configuration and they confirmed VT-d is enabled in the BIOS... (no choice but to trust them on this, I can't check it myself)

Am I missing something?

Code:
:/etc/modprobe.d# cat blacklist.conf
blacklist pcspkr
blacklist floppy
#blacklist snd_hda_intel
blacklist snd_hda_codec_hdmi
blacklist i915
install i915 /usr/bin/false
install intel_agp /usr/bin/false

:/etc/modprobe.d# cat iommu_unsafe_interrupts.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1

:/etc/modprobe.d# cat kvm.conf
options kvm ignore_msrs=1 report_ignored_msrs=0

:/etc/modprobe.d# cat mdadm.conf
# mdadm module configuration file
# set start_ro=1 to make newly assembled arrays read-only initially,
# to prevent metadata writes.  This is needed in order to allow
# resume-from-disk to work - new boot should not perform writes
# because it will be done behind the back of the system being
# resumed.  See http://bugs.debian.org/415441 for details.

options md_mod start_ro=1

:/etc/modprobe.d# cat pve-blacklist.conf
# This file contains a list of modules which are not supported by Proxmox VE

# nidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb

:/etc/modprobe.d# cat vfio.conf
options vfio-pci ids=8086:016a disable_vga=1

:# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.

vfio
vfio_iommu_type1
vfio-mdev
vfio_pci
vfio_virqfd

I blacklisted the i915 driver.

Motherboard info :
Code:
~# dmidecode -t 2
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: Intel Corporation
        Product Name: DH67BL
        Version: AAG10189-213
        Serial Number: ------------
        Asset Tag: To be filled by O.E.M.
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis: To be filled by O.E.M.
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0

~# efibootmgr
EFI variables are not supported on this system.

Thank you.

Edit :
Code:
:~# virt-host-validate
  QEMU: Checking for hardware virtualization                                 : PASS
  QEMU: Checking if device /dev/kvm exists                                   : PASS
  QEMU: Checking if device /dev/kvm is accessible                            : PASS
  QEMU: Checking if device /dev/vhost-net exists                             : PASS
  QEMU: Checking if device /dev/net/tun exists                               : PASS
  QEMU: Checking for cgroup 'cpu' controller support                         : PASS
  QEMU: Checking for cgroup 'cpuacct' controller support                     : PASS
  QEMU: Checking for cgroup 'cpuset' controller support                      : PASS
  QEMU: Checking for cgroup 'memory' controller support                      : PASS
  QEMU: Checking for cgroup 'devices' controller support                     : PASS
  QEMU: Checking for cgroup 'blkio' controller support                       : PASS
  QEMU: Checking for device assignment IOMMU support                         : WARN (No ACPI DMAR table found, IOMMU either disabled in BIOS or not supported by this hardware platform)
  QEMU: Checking for secure guest support                                    : WARN (Unknown if this platform has Secure Guest support)
   LXC: Checking for Linux >= 2.6.26                                         : PASS
   LXC: Checking for namespace ipc                                           : PASS
   LXC: Checking for namespace mnt                                           : PASS
   LXC: Checking for namespace pid                                           : PASS
   LXC: Checking for namespace uts                                           : PASS
   LXC: Checking for namespace net                                           : PASS
   LXC: Checking for namespace user                                          : PASS
   LXC: Checking for cgroup 'cpu' controller support                         : PASS
   LXC: Checking for cgroup 'cpuacct' controller support                     : PASS
   LXC: Checking for cgroup 'cpuset' controller support                      : PASS
   LXC: Checking for cgroup 'memory' controller support                      : PASS
   LXC: Checking for cgroup 'devices' controller support                     : PASS
   LXC: Checking for cgroup 'freezer' controller support                     : FAIL (Enable 'freezer' in kernel Kconfig file or mount/enable cgroup controller in your system)
   LXC: Checking for cgroup 'blkio' controller support                       : PASS
   LXC: Checking if device /sys/fs/fuse/connections exists                   : PASS
 
I know VT-x is enabled and working :

VT-x is for running VMs. For IOMMU and passthrough, VT-d needs to be enabled (and supported by the CPU and motherboard).
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.104-1-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init
I don't know how you can check this, but since intel_iommu=on is active and there are still no IOMMU groups, VT-d must not be enabled (or supported) in the motherboard BIOS.
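
As a rough way to check from the running system, here is a minimal sketch that looks for the ACPI DMAR table under sysfs, which is the same thing the `virt-host-validate` warning earlier in the thread refers to. The function name `check_dmar` and its optional root-prefix argument are assumptions added for illustration (the prefix only exists so the function can be pointed at a test directory).

```shell
#!/bin/sh
# Sketch: report whether the firmware exposes an ACPI DMAR table.
# check_dmar and its optional root prefix are hypothetical, for illustration.
check_dmar() {
    root="${1:-}"
    # On Intel platforms the kernel exposes the table here when the firmware provides it.
    if [ -e "$root/sys/firmware/acpi/tables/DMAR" ]; then
        echo "DMAR table found: VT-d is exposed by the firmware"
    else
        echo "no DMAR table: VT-d is disabled or unsupported"
    fi
}

# Usage on the host: check_dmar
```

If the table is missing, no kernel parameter will create IOMMU groups; the fix has to come from the BIOS or the hardware.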
 
Support told me VT-d is indeed enabled in the BIOS... I'm not able to confirm/check it myself... but this was my first guess.

We planned a maintenance window, they checked the BIOS and confirmed VT-d was enabled... So I don't know...
 
There are some integrated graphics, e.g. Broadwell, that are explicitly excluded from IOMMU, but if you are not getting any IOMMU groups at all then it is not enabled. It's not uncommon for people (on this forum) to check multiple times and eventually find out it was disabled in the BIOS by mistake.
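
To see at a glance whether any groups exist, a small sketch like the following can list them. `list_iommu_groups` is a hypothetical helper name, and the optional argument (defaulting to /sys/kernel/iommu_groups) is only there so it can be tried against a test directory.

```shell
#!/bin/sh
# Sketch: print one "group N: device" line per device in each IOMMU group.
# list_iommu_groups is a hypothetical helper name used only for this example.
list_iommu_groups() {
    base="${1:-/sys/kernel/iommu_groups}"
    found=0
    for dev in "$base"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob did not match: no groups at all
        group=${dev#"$base"/}          # strip the base path
        group=${group%%/*}             # keep only the group number
        echo "group $group: ${dev##*/}"
        found=1
    done
    [ "$found" -eq 1 ] || echo "no IOMMU groups"
}

# Usage on the host: list_iommu_groups
```

On a host where VT-d is active, this prints lines such as `group 3: 0000:00:02.0`; on the dedicated server described above it would print `no IOMMU groups`.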
 
I asked the support team to re-check.

And in the end it turns out they had confused VT-d with VT-x...

They told me the DH67BL (H67 chipset) does not support VT-d...
 
