Query Driver State

robert.s

Member
Sep 16, 2021
I have three VMs, each connected to a GPU via PCI passthrough.
All three GPUs are identical (same vendor ID, but each in a different IOMMU group).

All three VMs work fine over RDP (in parallel) with a small performance hit. But as soon as I connect with Parsec or another GPU-dependent remote protocol, performance drops heavily and the OpenGL version is downgraded from 4.6 to 1.1 (confirmed with GPU-Z).

So how can I confirm that the GPU is driven by the guest driver (not by the host) and is completely unbound and isolated from the host?
 
I'm not sure what you are asking, but if you get OpenGL 1.1, that is the Windows software driver; OpenGL 4.6 is (typically) only available from a driver for the actual hardware.
I believe that Parsec and similar tools require a physical display to be connected, or a plug that emulates the presence of a physical display.
 
My Proxmox host runs on an AMD EPYC with three NVIDIA RTX 3070 cards. I tried to pass all three cards through to three different VMs (following the official PCI passthrough documentation).

The steps in the documentation only work with a single GPU, or with multiple GPUs that have different vendor IDs. In my case the vendor IDs are identical, so the GPUs are somehow still bound to the host and not completely isolated. Because of this I cannot run GPU-dependent applications.

So how can I confirm whether the GPU is isolated? And if it is not, how can I isolate it completely? What extra steps would be required?
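
For reference, one way to bind a device to vfio-pci by PCI address rather than by vendor:device ID is the kernel's sysfs `driver_override` file. The following is only a minimal sketch under assumptions, not a step from the Proxmox documentation: the function name `bind_vfio` and the `SYSFS_BUS_PCI` variable are hypothetical (the variable exists only so the function can be dry-run against a scratch directory), and it assumes the vfio-pci module is already loaded and the function is run as root on the host.

```shell
#!/bin/sh
# Sketch: bind one PCI function to vfio-pci by address.
# SYSFS_BUS_PCI is a hypothetical override for dry runs; by default the
# real sysfs tree at /sys/bus/pci is used.
bind_vfio() {
    addr=$1
    bus="${SYSFS_BUS_PCI:-/sys/bus/pci}"
    dev="$bus/devices/$addr"

    if [ ! -e "$dev" ]; then
        echo "$addr: no such device" >&2
        return 1
    fi

    # Release the device from whatever host driver currently holds it.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$addr" > "$dev/driver/unbind"
    fi

    # From now on only vfio-pci is allowed to claim this device.
    echo vfio-pci > "$dev/driver_override"

    # Ask the kernel to re-run driver matching for this device.
    echo "$addr" > "$bus/drivers_probe"
}
```

It would be called once per function of each card, for example `bind_vfio 0000:c1:00.0` followed by `bind_vfio 0000:c1:00.1`. Because it works on addresses, identical vendor:device IDs on several cards do not matter.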
 
You can check the IOMMU groups with this command: for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done. If each GPU is in a group without other devices (don't count bridges), it is securely isolated. If not, try putting the cards in other PCIe slots. If that does not help, you can override the IOMMU grouping with the pcie_acs_override=downstream,multifunction kernel parameter (this splits the groups artificially, so the devices are no longer guaranteed to be isolated from each other).
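
The command above shows the grouping but not which kernel driver holds each device. A minimal sketch that reads both from sysfs for a single PCI address could look like this. The function name `pci_binding` and the `SYSFS_PCI` variable are hypothetical (the variable is only there so the function can be tried against a scratch directory); the `iommu_group` and `driver` symlinks are the standard sysfs entries for a PCI device.

```shell
#!/bin/sh
# Sketch: print the IOMMU group and the bound kernel driver of one PCI device.
# SYSFS_PCI is a hypothetical override for testing; by default the real
# sysfs path /sys/bus/pci/devices is used.
pci_binding() {
    addr=$1
    dev="${SYSFS_PCI:-/sys/bus/pci/devices}/$addr"

    if [ ! -e "$dev" ]; then
        echo "$addr: no such device" >&2
        return 1
    fi

    # iommu_group is a symlink whose target directory name is the group number.
    if [ -L "$dev/iommu_group" ]; then
        group=$(basename "$(readlink "$dev/iommu_group")")
    else
        group=none
    fi

    # driver is a symlink to the driver that currently owns the device;
    # it is absent when no driver is bound.
    if [ -L "$dev/driver" ]; then
        driver=$(basename "$(readlink "$dev/driver")")
    else
        driver=none
    fi

    printf '%s group=%s driver=%s\n' "$addr" "$group" "$driver"
}
```

Run on the host as, for example, `pci_binding 0000:c1:00.0`. A device that is handed over to a VM should report `driver=vfio-pci`; a host driver such as nouveau or nvidia in that field would mean the host still owns the card.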

I must have misunderstood your first question then, sorry. I thought you said it worked fine with passthrough and that the cards are in different IOMMU groups. I don't understand how your second question relates to the OpenGL version and Parsec being slow in the first question. That is not related to PCIe isolation (ACS) but probably depends on a (fake) monitor being plugged into the GPU.
 
Yes, all three cards are in different IOMMU groups. Right now no monitor or headless DP plug is connected to the system.


Group: 11 0000:c1:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070] [10de:2484] (rev a1) Driver: vfio-pci
Group: 11 0000:c1:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1) Driver: vfio-pci

Group: 27 0000:81:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070] [10de:2484] (rev a1) Driver: vfio-pci
Group: 27 0000:81:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1) Driver: vfio-pci

Group: 43 0000:41:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070] [10de:2484] (rev a1) Driver: vfio-pci
Group: 43 0000:41:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1) Driver: vfio-pci
 
