[SOLVED] Is it "proper" IOMMU group isolation?

NinthWave

Member
Sep 27, 2021
Montreal, CANADA
I am not sure I fully grasp what proper IOMMU isolation is.


iommugroup=1 device_name=GP107GL [Quadro P600] seems OK
iommugroup=14 device_name=SAA7164 [Hauppauge tuner] seems OK
iommugroup=15 device_name=Coral Edge TPU seems OK

iommugroup=1 device_name=Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller
iommugroup=1 device_name=Xeon E3-1200 v3/4th Gen Core Processor PCI Express x8 Controller

So:
The Quadro is in the x16 slot.
If either of the other two cards is in the x8 slot, my understanding is that I do not have proper isolation, because both cards (that one and the Quadro) are driven by two controllers that are both in iommugroup=1.

Am I right?
Or does it only matter that all cards are behind different "PCI Express Root Port" entries?

Thanks


Code:
root@proxmox:~# pvesh get /nodes/proxmox/hardware/pci --pci-class-blacklist ""
┌──────────┬────────┬──────────────┬────────────┬────────┬──────────────────────────────────────────────────────────────────────────┬──────┬──────────────────
│ class    │ device │ id           │ iommugroup │ vendor │ device_name                                                              │ mdev │ subsystem_device
╞══════════╪════════╪══════════════╪════════════╪════════╪══════════════════════════════════════════════════════════════════════════╪══════╪══════════════════
│ 0x010601 │ 0x8c02 │ 0000:00:1f.2 │         10 │ 0x8086 │ 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x020000 │ 0x1533 │ 0000:04:00.0 │         12 │ 0x8086 │ I210 Gigabit Network Connection                                          │      │ 0x35b6
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x020000 │ 0x1533 │ 0000:05:00.0 │         13 │ 0x8086 │ I210 Gigabit Network Connection                                          │      │ 0x35b6
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x030000 │ 0x1cb2 │ 0000:01:00.0 │          1 │ 0x10de │ GP107GL [Quadro P600]                                                    │      │ 0x11bd
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x030000 │ 0x0522 │ 0000:03:00.0 │         11 │ 0x102b │ MGA G200e [Pilot] ServerEngines (SEP1)                                   │      │ 0x0103
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x040300 │ 0x0fb9 │ 0000:01:00.1 │          1 │ 0x10de │ GP107GL High Definition Audio Controller                                 │      │ 0x11bd
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x048000 │ 0x7164 │ 0000:06:00.0 │         14 │ 0x1131 │ SAA7164                                                                  │      │ 0x8851
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x060000 │ 0x0c08 │ 0000:00:00.0 │          0 │ 0x8086 │ Xeon E3-1200 v3 Processor DRAM Controller                                │      │ 0x2010
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x060100 │ 0x8c52 │ 0000:00:1f.0 │         10 │ 0x8086 │ C222 Series Chipset Family Server Essential SKU LPC Controller           │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x060400 │ 0x0c01 │ 0000:00:01.0 │          1 │ 0x8086 │ Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller        │      │ 0x2010
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x060400 │ 0x0c05 │ 0000:00:01.1 │          1 │ 0x8086 │ Xeon E3-1200 v3/4th Gen Core Processor PCI Express x8 Controller         │      │ 0x2010
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x060400 │ 0x8c10 │ 0000:00:1c.0 │          4 │ 0x8086 │ 8 Series/C220 Series Chipset Family PCI Express Root Port #1             │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x060400 │ 0x8c12 │ 0000:00:1c.1 │          5 │ 0x8086 │ 8 Series/C220 Series Chipset Family PCI Express Root Port #2             │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x060400 │ 0x8c14 │ 0000:00:1c.2 │          6 │ 0x8086 │ 8 Series/C220 Series Chipset Family PCI Express Root Port #3             │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x060400 │ 0x8c16 │ 0000:00:1c.3 │          7 │ 0x8086 │ 8 Series/C220 Series Chipset Family PCI Express Root Port #4             │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x060400 │ 0x8c18 │ 0000:00:1c.4 │          8 │ 0x8086 │ 8 Series/C220 Series Chipset Family PCI Express Root Port #5             │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x0880ff │ 0x089a │ 0000:07:00.0 │         15 │ 0x1ac1 │ Coral Edge TPU                                                           │      │ 0x089a
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x0c0320 │ 0x8c2d │ 0000:00:1a.0 │          3 │ 0x8086 │ 8 Series/C220 Series Chipset Family USB EHCI #2                          │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x0c0320 │ 0x8c26 │ 0000:00:1d.0 │          9 │ 0x8086 │ 8 Series/C220 Series Chipset Family USB EHCI #1                          │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x0c0330 │ 0x8c31 │ 0000:00:14.0 │          2 │ 0x8086 │ 8 Series/C220 Series Chipset Family USB xHCI                             │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x0c0500 │ 0x8c22 │ 0000:00:1f.3 │         10 │ 0x8086 │ 8 Series/C220 Series Chipset Family SMBus Controller                     │      │ 0x35b7
├──────────┼────────┼──────────────┼────────────┼────────┼──────────────────────────────────────────────────────────────────────────┼──────┼──────────────────
│ 0x118000 │ 0x8c24 │ 0000:00:1f.6 │         10 │ 0x8086 │ 8 Series Chipset Family Thermal Management Controller                    │      │ 0x35b7
└──────────┴────────┴──────────────┴────────────┴────────┴──────────────────────────────────────────────────────────────────────────┴──────┴──────────────────
 
Devices in the same IOMMU group can only be passed to the same VM (or all stay with the Proxmox host). PCIe bridges and Root Ports don't matter (they are part of the motherboard or the device itself and only route data). Not all devices of a group need to be passed to the same VM, but the devices of that group that are not passed through will then not be accessible by the Proxmox host or other VMs (for security isolation).
Group 1 looks fine (only functions of the GPU device and some PCIe layout). Group 14 also looks fine (only one device and probably some PCIe bridge that is not shown by Proxmox), and the same goes for group 15.
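To see exactly which devices share a group (including bridges that the Proxmox table may not show), the groups can be read straight from sysfs. This is only a minimal sketch, assuming the usual Linux layout /sys/kernel/iommu_groups/<group>/devices/<address>. The function name list_iommu_groups and its optional root argument are made up for this example.

```shell
#!/bin/sh
# Print one line per PCI device, prefixed with its IOMMU group number.
# The optional first argument (an assumption of this sketch) overrides the
# sysfs root so the function can be pointed at a test directory.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when the IOMMU is off or the path is missing.
        [ -e "$dev" ] || continue
        group="${dev%/devices/*}"   # strip "/devices/<address>"
        group="${group##*/}"        # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Calling list_iommu_groups without arguments on the Proxmox host prints lines of the form "group N: address". Every address listed under the same group number has to go to the same VM or stay with the host.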
As long as you don't use pcie_acs_override (check with cat /proc/cmdline), the devices you want to pass through look properly isolated. They won't be able to communicate with each other (via DMA, without the CPU or IOMMU noticing) and steal information from other VMs and/or the Proxmox host, which is what the IOMMU groups protect against.
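The /proc/cmdline check can be wrapped in a small helper. This is a rough sketch: the function names acs_override_active and report_acs_override are invented here, and the optional file argument exists only so the check can be run against a sample file.

```shell
#!/bin/sh
# Succeed (exit status 0) when the kernel command line mentions pcie_acs_override.
# The optional argument (an assumption of this sketch) replaces /proc/cmdline.
acs_override_active() {
    cmdline_file="${1:-/proc/cmdline}"
    grep -q 'pcie_acs_override' "$cmdline_file"
}

# Print a short human-readable verdict.
# Note: an unreadable file is also reported as "not set".
report_acs_override() {
    if acs_override_active "$@"; then
        echo "pcie_acs_override is set: the IOMMU groups may not reflect real isolation"
    else
        echo "pcie_acs_override is not set"
    fi
}
```

Running report_acs_override on the host with no arguments reads the live kernel command line.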
 
Is it normal that the ID changes from the host to the VM?

As far as I can remember (and I may be wrong), in my previous install the IDs kept the same numbers, but now they are totally different.
I am asking because I can't successfully install the NVIDIA driver, either from the .run file or from APT; I receive this message:
"NVIDIA-SMI can't communicate with the driver... Please verify the driver is either loaded or installed"
 
Is it normal that the ID changes from the host to the VM?
Yes, because the VM has a virtual PCI(e) bus, with different PCI IDs.
As far as I can remember (and I may be wrong), in my previous install the IDs kept the same numbers, but now they are totally different.
That must have been a coincidence. It is possible because the virtual PCI IDs of hostpci0 up to hostpci4 are fixed, so one of them could happen to be the same as the PCI ID on the motherboard.
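Since the bus address differs between host and guest, a device is easier to recognise by its vendor and device ID, which stay the same (0x10de and 0x1cb2 for the Quadro P600 in the table from the first post). This is a small sketch that assumes the standard sysfs files /sys/bus/pci/devices/<address>/vendor and .../device. The function name find_pci_by_id and its optional third argument are made up for illustration.

```shell
#!/bin/sh
# Print the PCI address of every device whose vendor and device ID match.
# IDs are written as sysfs shows them, e.g. 0x10de 0x1cb2.
# The optional third argument (an assumption of this sketch) overrides the sysfs root.
find_pci_by_id() {
    want_vendor="$1"
    want_device="$2"
    root="${3:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # Ignore entries that do not expose both ID files.
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        read -r vendor < "$dev/vendor"
        read -r device < "$dev/device"
        if [ "$vendor" = "$want_vendor" ] && [ "$device" = "$want_device" ]; then
            printf '%s\n' "${dev##*/}"
        fi
    done
}
```

Running find_pci_by_id 0x10de 0x1cb2 on the host and again inside the VM should show the same card at two different addresses. If pciutils is installed, lspci -nn -d 10de:1cb2 gives similar information.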
I am asking because I can't successfully install the NVIDIA driver, either from the .run file or from APT; I receive this message:
"NVIDIA-SMI can't communicate with the driver... Please verify the driver is either loaded or installed"
Does your GPU reset properly? Do you maybe need some of the special NVIDIA work-arounds or GPU BIOS patching?
I really don't know, and hopefully someone else here does. There are also lots of threads about NVIDIA GPU passthrough on this forum.
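One thing that can be checked inside the VM without knowing the cause is whether a kernel module is loaded at all, since the message asks about exactly that. This is a hedged sketch, not a diagnosis: it assumes the usual /proc/modules format (module name first on each line), and the function names module_loaded and report_gpu_modules are invented for this example. The nouveau check is included on the assumption that the open-source driver, when loaded, can keep the proprietary one from binding to the card.

```shell
#!/bin/sh
# Succeed when the named kernel module appears in the module list.
# The optional second argument (an assumption of this sketch) replaces /proc/modules.
module_loaded() {
    name="$1"
    modules_file="${2:-/proc/modules}"
    # Each line starts with the module name followed by a space.
    grep -q "^${name} " "$modules_file"
}

# Report the state of the proprietary and the open-source NVIDIA drivers.
report_gpu_modules() {
    for m in nvidia nouveau; do
        if module_loaded "$m" "$@"; then
            echo "$m: loaded"
        else
            echo "$m: not loaded"
        fi
    done
}
```

If report_gpu_modules shows "nvidia: not loaded" in the guest, the kernel log (dmesg) is the next place to look for why the module did not load.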
 
