[SOLVED] is GPU passthrough working?

rhobinn

New Member
Sep 29, 2022
I'm having trouble getting GPU passthrough right. I have done this on a few computers (all with Intel processors),
but this one is eluding me.

First, this is my hardware:

AMD Ryzen 9 3900X
Gigabyte X570 Gaming X
GeForce RTX 3080 Ti GAMING X TRIO

I'm using Proxmox 7.2-3 with kernel:
Linux 5.15.30-2-pve #1 SMP PVE 5.15.30-3 (Fri, 22 Apr 2022 18:08:27 +0200)


My BIOS is updated to the latest version (F37d), and I have tried these settings:
IOMMU: Enabled
CSM: Disabled
Above 4G decoding: Enabled
Resizable BAR: Disabled
ACS: Enabled

I have another GPU (Quadro P620), but that's only for monitor output, and the BIOS is set to output on that one.
The 3080 Ti seems to be in its own IOMMU group (25).

These are the results of some typical commands for this:

lspci -nnv | grep VGA
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GL [Quadro P620] [10de:1cb6] (rev a1) (prog-if 00 [VGA controller])
0a:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2208] (rev a1) (prog-if 00 [VGA controller])

lspci -s 0a:00 -n
0a:00.0 0300: 10de:2208 (rev a1)
0a:00.1 0403: 10de:1aef (rev a1)

find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.2
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/3/devices/0000:00:03.0
/sys/kernel/iommu_groups/4/devices/0000:00:03.1
/sys/kernel/iommu_groups/5/devices/0000:00:04.0
/sys/kernel/iommu_groups/6/devices/0000:00:05.0
/sys/kernel/iommu_groups/7/devices/0000:00:07.0
/sys/kernel/iommu_groups/8/devices/0000:00:07.1
/sys/kernel/iommu_groups/9/devices/0000:00:08.0
/sys/kernel/iommu_groups/10/devices/0000:00:08.1
/sys/kernel/iommu_groups/11/devices/0000:00:14.3
/sys/kernel/iommu_groups/11/devices/0000:00:14.0
/sys/kernel/iommu_groups/12/devices/0000:00:18.3
/sys/kernel/iommu_groups/12/devices/0000:00:18.1
/sys/kernel/iommu_groups/12/devices/0000:00:18.6
/sys/kernel/iommu_groups/12/devices/0000:00:18.4
/sys/kernel/iommu_groups/12/devices/0000:00:18.2
/sys/kernel/iommu_groups/12/devices/0000:00:18.0
/sys/kernel/iommu_groups/12/devices/0000:00:18.7
/sys/kernel/iommu_groups/12/devices/0000:00:18.5
/sys/kernel/iommu_groups/13/devices/0000:01:00.0
/sys/kernel/iommu_groups/14/devices/0000:02:01.0
/sys/kernel/iommu_groups/15/devices/0000:02:02.0
/sys/kernel/iommu_groups/16/devices/0000:02:03.0
/sys/kernel/iommu_groups/17/devices/0000:02:04.0
/sys/kernel/iommu_groups/18/devices/0000:07:00.0
/sys/kernel/iommu_groups/18/devices/0000:02:08.0
/sys/kernel/iommu_groups/18/devices/0000:07:00.3
/sys/kernel/iommu_groups/18/devices/0000:07:00.1
/sys/kernel/iommu_groups/19/devices/0000:08:00.0
/sys/kernel/iommu_groups/19/devices/0000:02:09.0
/sys/kernel/iommu_groups/20/devices/0000:09:00.0
/sys/kernel/iommu_groups/20/devices/0000:02:0a.0
/sys/kernel/iommu_groups/21/devices/0000:03:00.0
/sys/kernel/iommu_groups/22/devices/0000:04:00.1
/sys/kernel/iommu_groups/22/devices/0000:04:00.0
/sys/kernel/iommu_groups/23/devices/0000:05:00.0
/sys/kernel/iommu_groups/24/devices/0000:06:00.0
/sys/kernel/iommu_groups/25/devices/0000:0a:00.0
/sys/kernel/iommu_groups/25/devices/0000:0a:00.1
/sys/kernel/iommu_groups/26/devices/0000:0b:00.0
/sys/kernel/iommu_groups/27/devices/0000:0c:00.0
/sys/kernel/iommu_groups/28/devices/0000:0c:00.1
/sys/kernel/iommu_groups/29/devices/0000:0c:00.3
/sys/kernel/iommu_groups/30/devices/0000:0c:00.4
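A listing like the one above is tedious to read by eye. The following is a minimal Python sketch, not an official tool, that groups the paths by IOMMU group and checks whether a PCI slot shares its group with anything else. The function names and the `SAMPLE` constant are made up for illustration; the sample lines are copied from the output above. The regular expression splits on whitespace, so it works whether the paths are one per line or space-separated.

```python
import re
from collections import defaultdict

# Matches paths like /sys/kernel/iommu_groups/25/devices/0000:0a:00.0
GROUP_PATH = re.compile(r"/sys/kernel/iommu_groups/(\d+)/devices/(\S+)")


def parse_iommu_groups(find_output):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = defaultdict(list)
    for group, device in GROUP_PATH.findall(find_output):
        groups[int(group)].append(device)
    return {group: sorted(devices) for group, devices in groups.items()}


def is_isolated(groups, pci_slot):
    """True if the slot's functions sit in one group with no other devices.

    pci_slot is given without the function suffix, e.g. '0000:0a:00'.
    """
    prefix = pci_slot + "."
    owning = [g for g, devs in groups.items()
              if any(d.startswith(prefix) for d in devs)]
    if len(owning) != 1:
        return False  # slot not found, or spread over several groups
    return all(d.startswith(prefix) for d in groups[owning[0]])


# A few lines copied from the listing in this post (hypothetical constant name)
SAMPLE = """\
/sys/kernel/iommu_groups/18/devices/0000:07:00.0
/sys/kernel/iommu_groups/18/devices/0000:02:08.0
/sys/kernel/iommu_groups/22/devices/0000:04:00.1
/sys/kernel/iommu_groups/22/devices/0000:04:00.0
/sys/kernel/iommu_groups/25/devices/0000:0a:00.0
/sys/kernel/iommu_groups/25/devices/0000:0a:00.1
"""

if __name__ == "__main__":
    groups = parse_iommu_groups(SAMPLE)
    for slot in ("0000:0a:00", "0000:07:00"):
        print(slot, "isolated:", is_isolated(groups, slot))
```

On the host, the real output of `find /sys/kernel/iommu_groups/ -type l` could be pasted into a string and passed to `parse_iommu_groups` in the same way.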

My system supports interrupt remapping:

dmesg | grep 'remapping'
[ 0.370033] x2apic: IRQ remapping doesn't support X2APIC mode
[ 0.635832] AMD-Vi: Interrupt remapping enabled

I added the suggested modules to /etc/modules:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

I have tried different things in the file /etc/default/grub,
but if I put

GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on"

I get this from dmesg | grep -e DMAR -e IOMMU:
[ 0.629125] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 0.634719] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 0.635124] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).

I have tried different options:
-quiet
-vfio-pci.ids=10de:1e81,10de:10f8,10de:1ad8,10de:1ad9
-modprobe.blacklist=radeon,nouveau,nvidia,nvidiafb,nvidia-gpu
and every time I get exactly the same message
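One detail that may be worth checking: the IDs given to vfio-pci.ids in the list above do not appear in the `lspci -s 0a:00 -n` output earlier in this post (10de:2208 and 10de:1aef). Whether binding by ID is needed at all is not established in this thread, so treat the following as a rough Python sketch only. The function names are hypothetical, and it assumes plain vendor:device pairs without the optional subvendor fields.

```python
import re


def ids_from_lspci_n(lspci_output):
    """Collect vendor:device IDs (e.g. 10de:2208) from `lspci -n` style text."""
    return set(re.findall(r"\b([0-9a-f]{4}:[0-9a-f]{4})\b", lspci_output.lower()))


def ids_from_cmdline(cmdline):
    """Collect the IDs passed to vfio-pci.ids= on a kernel command line."""
    match = re.search(r"vfio-pci\.ids=(\S+)", cmdline)
    if not match:
        return set()
    return {item.lower() for item in match.group(1).strip('"').split(",") if item}


def missing_ids(lspci_output, cmdline):
    """IDs present on the device but absent from vfio-pci.ids, sorted."""
    return sorted(ids_from_lspci_n(lspci_output) - ids_from_cmdline(cmdline))


if __name__ == "__main__":
    # Both strings are copied from this post
    lspci = "0a:00.0 0300: 10de:2208 (rev a1)\n0a:00.1 0403: 10de:1aef (rev a1)"
    cmdline = "quiet vfio-pci.ids=10de:1e81,10de:10f8,10de:1ad8,10de:1ad9"
    print("not covered by vfio-pci.ids:", missing_ids(lspci, cmdline))
```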

only when I add:
-pcie_acs_override=downstream,multifunction
it adds the line to the dmesg results:
[ 0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA

but I still cannot seem to enable IOMMU

but...
by mistake I once used
intel_iommu=on

and then I got

dmesg | grep -e DMAR -e IOMMU

[ 0.084082] DMAR: IOMMU enabled
[ 0.627289] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 0.632809] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 0.633146] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).

and I don't know if that is actually correct...
Why does this happen? It is not an Intel processor... why does IOMMU turn on when I set intel_iommu=on?
is it a false positive?
or that's correct and I should use intel_iommu=on?

what should I do?
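One way to look at the two dmesg outputs above is to sort the lines by the prefix they carry. As far as is generally documented, `DMAR:` lines come from the kernel's Intel VT-d code and `AMD-Vi:` lines from the AMD IOMMU driver, so on an AMD system a lone `DMAR: IOMMU enabled` line most likely only reflects that the intel_iommu=on parameter was parsed. The Python sketch below rests on that assumption; the function name and bucket labels are invented for illustration.

```python
def split_by_driver(dmesg_text):
    """Bucket dmesg lines by the IOMMU driver prefix they carry."""
    buckets = {"intel": [], "amd": [], "other": []}
    for line in dmesg_text.splitlines():
        if "DMAR:" in line:
            buckets["intel"].append(line)
        elif "AMD-Vi:" in line or "amd_iommu" in line:
            buckets["amd"].append(line)
        elif line.strip():
            buckets["other"].append(line)
    return buckets


if __name__ == "__main__":
    # Lines copied from the intel_iommu=on output in this post
    dmesg = """\
[ 0.084082] DMAR: IOMMU enabled
[ 0.627289] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 0.632809] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 0.633146] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
"""
    for driver, lines in split_by_driver(dmesg).items():
        print(driver, len(lines))
```

The AMD lines are the same with and without the Intel parameter, which fits the replies further down saying the AMD IOMMU is on by default.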

To test the GPU in the VM,
I'm going to run the simulation software GROMACS, and installing and compiling it takes a long time...
I will try it to see whether it works with the intel_iommu=on setting...

I also found this guide
https://pve.proxmox.com/wiki/Pci_passthrough
and at the end it says:
Start the VM and enter the qm monitor on the CLI: "qm monitor vmnumber"
Verify that your card is listed here: "info pci"
Then install drivers on your guest OS.
and I have no idea how to do that...

So what do you think is happening? Why does IOMMU only work when I set it to Intel instead of AMD?
 
amd_iommu is on by default, as I tried to explain in your other very similar thread.
Please don't use pcie_acs_override=downstream,multifunction, as it does introduce a security issue and you really don't need it with an X570 motherboard.
It's really simple: IOMMU is on, otherwise you would not have multiple IOMMU groups and would not be able to start the VM with passthrough.
 
Regarding the GPU passthrough (which is probably working):
Do you have a physical display connected to it? Does it display anything?
Did you make sure that the system boot and Proxmox host console are using the Quadro P620? You can select this in the Gigabyte BIOS.
Can you please attach the VM configuration file of your VM? Did you enable Primary GPU?
Do you also passthrough a USB controller or ports to have a keyboard and mouse to use inside the VM?
What operating system did you install inside the VM?
 
amd_iommu is on by default, as I tried to explain in your other very similar thread.
Please don't use pcie_acs_override=downstream,multifunction, as it does introduce a security issue and you really don't need it with an X570 motherboard.
It's really simple: IOMMU is on, otherwise you would not have multiple IOMMU groups and would not be able to start the VM with passthrough.
I removed amd_iommu and pcie_acs_override...

Sorry about the double post, I created one and realized a mistake, edited it and boom it was gone.. I went to my profile and saw that the post disappeared... so I created another one.. my bad...... I think it takes some time to appear back after editing...

Regarding the GPU passthrough (which is probably working):
Do you have a physical display connected to it? Does it display anything?
I have a monitor connected to the Quadro, and it displays the boot process and then the Proxmox terminal...
Did you make sure that the system boot and Proxmox host console are using the Quadro P620? You can select this in the Gigabyte BIOS.
Yes, I made the Quadro the 'initial display' in the BIOS settings..

Can you please attach the VM configuration file of your VM?
balloon: 0
boot: order=scsi0;ide2;net0
cores: 24
hostpci0: 0000:0a:00.0
hostpci1: 0000:0a:00.1
ide2: none,media=cdrom
kvm: 0
memory: 45000
meta: creation-qemu=6.2.0,ctime=1664403597
net0: virtio=B4:2E:99:A2:6E:02,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local:100/vm-100-disk-0.qcow2,size=200G
scsihw: virtio-scsi-pci
smbios1: uuid=e6ce6e2a-3b4e-4ba4-94e1-4788379055e5
sockets: 1
vmgenid: 1667cfb5-4dd5-4a5d-9967-be2dd914f44d

Did you enable Primary GPU?
No
Do you also passthrough a USB controller or ports to have a keyboard and mouse to use inside the VM?
What operating system did you install inside the VM?
I use Ubuntu Server, over SSH only, so I didn't change anything regarding USB.

If you are having problems with a VM with passthrough, it's not because IOMMU is not enabled but probably because the GPU is used during boot and by Proxmox. The various work-arounds are in this thread from several months ago.
This is from your answer in the other post... and I don't know if this still applies when the Quadro is used during boot instead of the 3080 Ti.



More about my problem:
In the Linux guest I installed the NVIDIA drivers, and if I run watch nvidia-smi I can see the graphics card:

| NVIDIA-SMI 515.43.04 Driver Version: 515.43.04 CUDA Version: 11.7 |
| 0 NVIDIA GeForce ... On | 00000000:00:10.0 Off | N/A |
| 0% 46C P8 3W / 350W | 1MiB / 12288MiB | 0% Default |

I use the graphics card for molecular simulations with a software package called GROMACS, but when I run the test the graphics card stays at 0% usage...

I've tried the same setup on bare metal and it works fine, and I have also run it in a VM on other computers... But I understand that using this software as a test adds a lot of variables that are harder to control... So I'm wondering if there is an easier way to see whether the system can use the graphics card...
 
Try enabling Primary GPU, that usually makes passthrough of NVidia GPUs work. That also disables the virtual display, so you should see output on an attached monitor. That would also show that the system can use that GPU.

You don't need to pass through both functions of the GPU (VGA + audio) separately; just pass through the GPU part (0a:00.0) and enable All Functions in the Proxmox GUI (near the Primary GPU setting).
Enable PCI Express for the passthrough (which requires machine type q35), as the NVIDIA drivers might expect or assume that.
kvm: 0 is weird (and makes it slow?), and I don't think you need it if you enable Primary GPU.
The memory size is weird (not a power of 2 or a multiple of 1024). Also make sure it's not too big, because all VM memory must be pinned into actual host RAM when using passthrough.
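The points in this reply can be turned into a rough checklist over the VM configuration posted earlier. The Python sketch below is only an illustration of those checks, not a Proxmox tool: `parse_vm_config`, `review`, and `POSTED_CONFIG` are hypothetical names, the parser ignores snapshot sections, and only the relevant lines of the posted config are reproduced.

```python
def parse_vm_config(text):
    """Parse 'key: value' lines of a VM config into a dict (no snapshot sections)."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and not key.startswith("#"):
            config[key.strip()] = value.strip()
    return config


def review(config):
    """Return notes for the settings questioned in the reply above."""
    notes = []
    if config.get("kvm") == "0":
        notes.append("kvm: 0 turns off hardware virtualization")
    memory = int(config.get("memory", "0"))
    if memory % 1024:
        notes.append(f"memory {memory} is not a multiple of 1024")
    if "q35" not in config.get("machine", ""):
        notes.append("machine type is not q35, so PCI Express is unavailable")
    # Group hostpci entries by slot (address without the .function suffix)
    slots = {}
    for key, value in config.items():
        if key.startswith("hostpci"):
            address = value.split(",")[0]
            slot = address.rsplit(".", 1)[0] if "." in address else address
            slots.setdefault(slot, []).append(key)
    for slot, keys in slots.items():
        if len(keys) > 1:
            notes.append(f"{slot} is split across {', '.join(sorted(keys))}")
    return notes


# Relevant lines from the configuration posted in this thread
POSTED_CONFIG = """\
cores: 24
hostpci0: 0000:0a:00.0
hostpci1: 0000:0a:00.1
kvm: 0
memory: 45000
ostype: l26
"""

if __name__ == "__main__":
    for note in review(parse_vm_config(POSTED_CONFIG)):
        print("-", note)
```

For the posted configuration this flags all four points: kvm: 0, the memory size, the missing q35 machine type, and the GPU functions passed as two separate entries.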

I don't know that software and have no experience with NVidia, sorry.

PS: Yes, when posting for the first few times, posts do vanish for a while. It's also confusing for me when I'm replying and can't post because the thread no longer exists. It's probably temporarily invisible to normal users of the forum while awaiting moderation by the staff.
 
I had kvm: 0 because the VM wouldn't run with KVM activated... I just learned that it was possible to activate virtualization on my motherboard. It was deeply nested in the motherboard's options, and I was looking for something with 'virtual' in the name instead of SVM, so dumb.

I also learned the differences between q35 and i440fx. So I created a new q35 VM... installed and compiled my software... passed through only the VGA function... enabled All Functions and PCI Express... adjusted the memory size... and it worked!
1 minute later I smelled the scent of smoke and the power supply was fried haha...
but it fried because it worked!!

thank you, you're great!
 
I had kvm: 0 because the VM wouldn't run with KVM activated... I just learned that it was possible to activate virtualization on my motherboard. It was deeply nested in the motherboard's options, and I was looking for something with 'virtual' in the name instead of SVM, so dumb.
This happens to others too, but not having hardware virtualization support enabled (SVM, VT-x) was probably part of the passthrough problem (and a clue).
I also learned the differences between q35 and i440fx. So I created a new q35 VM... installed and compiled my software... passed through only the VGA function... enabled All Functions and PCI Express... adjusted the memory size... and it worked!
I would expect Ubuntu to not have a problem switching between the machine types, but glad to see you got it working.
1 minute later I smelled the scent of smoke and the power supply was fried haha...
but it fried because it worked!!
Glad you found it funny and that it did not break other stuff or burn down the house. It's the first time I broke someone's hardware from a distance, and it makes a good story.
 