Help with confusion about when/why to use vIOMMU?

Hello,

I kind of ignored this feature when it was first announced. I assumed it was just for nested virtualization, and it does play a role in that.

But on further reading, it sounds like, because it changes the memory mapping of VFIO devices, it can affect performance for real PCIe devices passed through to a VM? Is that right?

I know that performance improvements can't be guaranteed, but it might be useful to add some examples of potential performance gains to the docs, or more broadly a section on when/why/for which use cases you'd want to enable it.
See: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_pci_viommu
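For context, this is roughly the knob I'm talking about: vIOMMU is chosen per VM as part of the machine setting. A minimal sketch only, with VM ID 100 as a placeholder; please check the linked doc section for the exact syntax on your PVE version:

Code:
# Intel flavour (emulated VT-d); the docs indicate this needs the q35 machine type:
qm set 100 --machine q35,viommu=intel
# VirtIO flavour:
qm set 100 --machine q35,viommu=virtio
# Equivalent line in the VM config (/etc/pve/qemu-server/100.conf):  machine: q35,viommu=intel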

Also, the docs seem to suggest that Intel's vIOMMU implementation is older and less featureful compared to VirtIO vIOMMU, but Intel vIOMMU seems to be what everyone posting here is using. Is there a reason for that? Is Intel's implementation actually better for certain use cases, or is it the case that the VirtIO implementation is still more experimental?

@uzumo has been experimenting with vIOMMU and Intel iGPU SR-IOV-based passthrough; it got me curious and made me realize I didn't really understand it yet.
 

I have no idea how these settings affect performance.

The slowdown doesn't happen in my Proxmox VE 8 environment, and it's quite possible that Proxmox VE 9 was simply slow at times for some other reason.

I did confirm that it ran much slower on Proxmox VE 9, in a VM created with the same settings as on Proxmox VE 8 apart from the vIOMMU option.

However, when I tried it again afterwards, it was not slow with any combination of settings.
 
Thanks. :)

I'm just trying to figure out when I'd want to use it.
Anecdotally, it's never been mentioned in any of the guides I've used for SR-IOV-based PCIe passthrough for NICs or iGPUs.
But suddenly everyone's talking about it and I feel like I've missed something.
 
Enabling vIOMMU is only necessary if you have a guest that is itself a hypervisor needing an IOMMU, i.e. nested virtualization (e.g., a Hyper-V test bed) where you also want the nested hypervisor to pass through the hardware that was passed through to it.
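If it helps, here is a quick way to check from inside a Linux guest whether it actually sees a (v)IOMMU at all; just standard commands, nothing Proxmox-specific:

Code:
# Run inside the guest. No DMAR/IOMMU lines and an empty iommu_groups directory
# usually means the guest has no usable IOMMU.
dmesg | grep -i -e DMAR -e IOMMU
ls /sys/kernel/iommu_groups/
# A Linux guest typically also needs intel_iommu=on on its own kernel cmdline to use the emulated VT-d:
cat /proc/cmdline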
 
Thanks!

This is what I thought, but I started to second-guess myself a bit after seeing so many posts about it.

I suppose that I didn't realize how many people here were doing nested virtualization. :)
 
I've been thinking about this a bit more, and I'm curious if Docker or Podman count as nested virtualization for vIOMMU purposes, as well.

Probably not, but I'm curious. :)
 
No, Docker/Podman use kernel-level containers, which effectively run applications natively and only pretend to the application that it is isolated (contained by cgroups). On Windows guests, however, Docker does launch a Linux VM, so you do need nested virtualization in that case.
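A quick way to see the difference in practice, assuming a systemd-based environment (purely illustrative):

Code:
# Inside a VM this prints the hypervisor type (e.g. "kvm"); inside a plain container it prints "none".
systemd-detect-virt --vm
# Inside a Docker/Podman container this prints "docker"/"podman"; in a full VM it prints "none".
systemd-detect-virt --container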
 
Hi @SInisterPisces,

The first thing to understand is what an IOMMU is:

An IOMMU is hardware that manages how devices access memory. It translates device I/O virtual addresses into physical RAM addresses and enforces isolation so that devices can only access the memory regions assigned to them. You can think of it as an MMU (Memory Management Unit) for devices instead of the CPU.

In virtualization, a vIOMMU provides the same isolation at the VM level. Each VM gets its own "address space," allowing physical devices passed through to VMs to remain properly isolated. This is especially important when using features like SR-IOV, where virtual functions of a network device are directly assigned to guest VMs.

It's worth pointing out that vIOMMU isn't always required for passthrough, but it becomes necessary when the guest needs to program the IOMMU directly (common with SR-IOV).

For reference, Intel and AMD use different names for their IOMMU virtualization capabilities: VT-d (Intel) and AMD-Vi (AMD). And, yes, it carries non-negligible performance overhead.
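For completeness, this is roughly how the host-side IOMMU (VT-d / AMD-Vi) is enabled and verified on a GRUB-booted PVE host. Treat it as a sketch: newer kernels may already enable the Intel IOMMU by default, and systemd-boot setups use /etc/kernel/cmdline instead.

Code:
# In /etc/default/grub (Intel example; AMD hosts usually have the IOMMU enabled by default):
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
update-grub   # or: proxmox-boot-tool refresh, for systemd-boot
# Verify after a reboot:
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
find /sys/kernel/iommu_groups/ -type l | head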


 
Thanks, @guruevi . That's how I was vaguely imagining Docker and Podman worked, but I wasn't clear on the technical implementation.

It's worth pointing out that vIOMMU isn't always required for passthrough, but it becomes necessary when the guest needs to program the IOMMU directly (common with SR-IOV).
@bbgeek17 Thanks for the succinct description of what an IOMMU actually is. I've read whitepapers, but they tend to go so deep into implementation details that the core of what's actually happening (I/O memory management) gets lost. I don't blame anyone for just figuring out that "I need it for PCIe passthrough," learning how to turn it on, and then hoping never to think about it again. ;)

I'm curious about the quoted part of your message, above.
I use SR-IOV-based PCIe passthrough for my Intel i7-12700T's iGPU. I get 7 VFs off of it using a hacked DKMS i915 driver on the host. VMs using the passed through VFs work, though I do get a ton of warnings and weird output on the host PVE node's console.

Is it possible that enabling vIOMMU for VMs using those VFs would make them work better (maybe by reducing overhead/translation for memory access)? Is there a good way to test that other than A/B testing with a 3D benchmark app inside the VM?
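For reference, this is the kind of A/B toggle I have in mind; the VM ID 100 and the 00:02.x addresses are placeholders (00:02 just happens to be the usual slot for an Intel iGPU and its VFs):

Code:
# On the host: list the iGPU and its virtual functions, and see which IOMMU group each sits in.
lspci -s 00:02 -nn
find /sys/kernel/iommu_groups/ -type l | grep 0000:00:02
# Toggle vIOMMU on the test VM between benchmark runs:
qm set 100 --machine q35,viommu=intel
qm set 100 --machine q35            # back to no vIOMMU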
 
Hi @SInisterPisces,

vIOMMU does not generally improve the raw performance of SR-IOV GPU virtual functions; SR-IOV itself is designed to bypass much of the hypervisor overhead by giving VMs direct access to hardware, and vIOMMU is primarily focused on improving security, memory isolation, and compatibility, not throughput or latency.

That said, if there is one thing I can tell you, it's that vIOMMU is REALLY good at finding incorrectly implemented drivers :-) Because it enforces protections.
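If you want to see it catch something, watch the kernel log while the device is under load; a rough example only, since the exact fault messages vary by kernel and vendor:

Code:
# On the host (and inside the guest, if it runs Linux): follow the kernel log and look for IOMMU faults.
dmesg -w | grep -i -e DMAR -e "io page fault" -e AMD-Vi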


 
Thanks again. One thing I'm still not clear on, or rather managed to confuse myself about again: would you still only use vIOMMU for nested virtualization, or would you also use it with non-nested virtualization and SR-IOV, for the "security, memory isolation, and compatibility" benefits you mentioned? I'm trying to decide whether it's worth experimenting with on an otherwise working VM that uses SR-IOV iGPU passthrough. I suspect the DKMS driver might object in any case; it's an unofficial package based on an official package that doesn't get a lot of love inside Intel.

Do you have any advice on when, if using vIOMMU, the VirtIO vIOMMU is preferable to the Intel vIOMMU implementation in Proxmox? It sounds like the VirtIO vIOMMU is newer and easier to set up, while the Intel vIOMMU is older and better tested.
 
You should only need a virtual IOMMU in the guest if you have software/drivers running in the guest that expect to be able to program an IOMMU.

It is my understanding that you only need a vIOMMU if you are attempting nested virtualization (i.e., running another virtual machine inside your virtual machine) along with PCI passthrough (i.e., "nested PCI passthrough").

For basic nested virtualization (i.e., running virtual machines inside other virtual machines for standard workloads), a virtual IOMMU is not required.

For basic PCI passthrough (i.e., passing through a virtual function to a guest), a virtual IOMMU is not required.
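For the nested case specifically, the rough shape looks like this; the VM ID and PCI address are placeholders, not a tested recipe:

Code:
# On the PVE host: give the L1 (outer) guest a vIOMMU plus the device/VF to hand down.
qm set 100 --cpu host --machine q35,viommu=intel --hostpci0 0000:00:02.1,pcie=1
# Inside the L1 guest (if it runs Linux/KVM): enable its own IOMMU support
# (e.g. intel_iommu=on on the guest kernel cmdline), confirm /sys/kernel/iommu_groups/
# is populated, and only then pass the device on to the nested L2 VM.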


 
Thanks again. I'd managed to get myself pretty confused even while reading this thread.

This is exactly what I needed. :)