Problem passing GPU to Windows VM

Bytales

Member
Oct 8, 2018
I have managed to set up a Windows VM within Proxmox. I gave it two physical NVMe drives, installed the VirtIO drivers, and gave it a mouse and keyboard. All is well except for the GPU passthrough, which is the most important reason for making this VM. The VM is useless to me if I can't pass the GPU through.

Here are the settings I have made:

1) My /etc/default/grub file has
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"

2) /etc/modules is updated to contain
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

3) Running
Code:
dmesg | grep AMD-Vi
lists several lines, one of which says
AMD-Vi: Interrupt remapping enabled
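
A change to /etc/default/grub normally only takes effect after the GRUB config is regenerated (update-grub on Debian-based hosts) and the host is rebooted, so it can be worth confirming that the running kernel really got the parameter from step 1. A minimal sketch of such a check; the function name has_kernel_arg and its optional file argument are made up for illustration:

```shell
# has_kernel_arg ARG [FILE]
# Succeeds if ARG appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline (the optional argument is only there
# so the check can be tried against a saved copy).
has_kernel_arg() {
    arg="$1"
    file="${2:-/proc/cmdline}"
    # Put every parameter on its own line, then look for an exact match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$arg"
}
```

Used as `has_kernel_arg amd_iommu=on && echo "amd_iommu=on is active"`.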

4) IOMMU isolation.
The GPU I want to pass through (AMD Vega Frontier Edition, watercooled) is alone in its IOMMU group, as is its sound device, as can be seen by running these commands:

Code:
lspci | grep VGA
49:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)
4c:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 6863
64:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 6863

Code:
lspci | grep 4c
4c:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 6863
4c:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aaf8

Code:
find /sys/kernel/iommu_groups/ -type l | grep 4c
/sys/kernel/iommu_groups/66/devices/0000:4c:00.0
/sys/kernel/iommu_groups/67/devices/0000:4c:00.1

The GPU and its sound device are in IOMMU groups 66 and 67, respectively.
I have verified that these devices are alone in their IOMMU groups.
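
The "alone in their group" check can also be done for all groups at once. A small sketch that walks the same /sys/kernel/iommu_groups tree used above; the function name list_iommu_groups and the optional root argument are invented for this example:

```shell
# list_iommu_groups [ROOT]
# Prints "group <n>: <pci-address>" for every device found under
# ROOT/<n>/devices/. ROOT defaults to /sys/kernel/iommu_groups.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the literal pattern when the glob matched nothing.
        [ -e "$dev" ] || [ -L "$dev" ] || continue
        group="${dev%/devices/*}"   # drop "/devices/<address>"
        group="${group##*/}"        # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

A group is isolated when it shows up on exactly one line, which can be checked with `list_iommu_groups | grep 'group 66:'`.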

5) Blacklisting radeon
/etc/modprobe.d/blacklist.conf contains
blacklist radeon

6) My /etc/modprobe.d/vfio.conf contains
options vfio-pci ids=1002:6863,1002:aaf8
because
Code:
lspci -n -s 4c:00
gives me
4c:00.0 0300: 1002:6863
4c:00.1 0403: 1002:aaf8

The problem here is the second GPU, which I am going to try to pass to a second VM:
64:00.0 (second Vega Frontier)
If I run
Code:
lspci -n -s 64:00
gives me
64:00.0 0300: 1002:6863
64:00.1 0403: 1002:aaf8
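
Because the ids= option matches on vendor:device ID, every card that reports 1002:6863 is affected by it, not just the one at 4c:00. A sketch that makes such collisions visible from `lspci -n` output; the function name duplicate_pci_ids is invented for this example:

```shell
# duplicate_pci_ids
# Reads "lspci -n" output on stdin and prints each vendor:device ID
# that occurs at more than one PCI address, followed by the addresses.
duplicate_pci_ids() {
    # Field 1 is the PCI address, field 3 is the vendor:device ID.
    awk '
        { count[$3]++; addrs[$3] = addrs[$3] " " $1 }
        END {
            for (id in count)
                if (count[id] > 1)
                    print id ":" addrs[id]
        }
    ' | sort
}
```

Used as `lspci -n | duplicate_pci_ids`.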

Therefore my VM conf contains
Code:
hostpci0: 4c:00.0,pcie=1,x-vga=on
hostpci1: 4c:00.1,pcie=1

With these settings, the VM starts, but I get no image on the monitor attached to the GPU. The VNC console doesn't show any image either, presumably because the image is output through the passed-through GPU.

Questions:
1) Should I also be getting an image in the VNC console if everything is passed through OK?
2) Is it a problem that both of my GPUs, although they have different PCI addresses, have exactly the same IDs? Couldn't I put a complete PCI address together with the ID into vfio.conf?
That way I would write
options vfio-pci ids=4c:00.0/1002:6863,4c:00.1/1002:aaf8 (if that is the correct syntax, I don't know)
I would need to specify it exactly, because both cards have the same ID.
3) The BIOS is OVMF and the EFI disk is in order. If I remove the PCI devices from the VM, the VM starts OK, I get an image on the VNC console, and Windows boots.
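
On question 2: as far as I know, the ids= option of vfio-pci only takes vendor:device IDs, so an address/ID pair like the one above is not something it accepts. One way to pick a single card by address is the kernel's per-device driver_override file in sysfs. A minimal sketch, assuming it runs as root before any other driver has claimed the card; the function name set_driver_override and its optional root argument are invented for illustration:

```shell
# set_driver_override ADDR DRIVER [ROOT]
# Writes DRIVER into ROOT/ADDR/driver_override so that only the device
# at that PCI address prefers DRIVER on the next driver probe.
# ROOT defaults to /sys/bus/pci/devices.
set_driver_override() {
    addr="$1"
    driver="$2"
    root="${3:-/sys/bus/pci/devices}"
    if [ ! -d "$root/$addr" ]; then
        echo "no such device: $addr" >&2
        return 1
    fi
    printf '%s\n' "$driver" > "$root/$addr/driver_override"
}
```

For the first card this would be `set_driver_override 0000:4c:00.0 vfio-pci` and `set_driver_override 0000:4c:00.1 vfio-pci`. A device that is already bound to another driver still has to be unbound and re-probed afterwards, which this sketch does not do.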

As soon as I add the PCI devices, I get the problem above.
Perhaps I am doing something wrong?

Any help would be appreciated. I have managed to set up everything else I need (apart from passing through the optical drive), and this GPU passthrough is the last thing I need to get working.

I sure hope it is possible.
Any ideas are deeply appreciated. I have been trying to get these VMs working for the past three weeks, going on a month now. I started with Unraid, then ESXi, and now Proxmox. I decided to stay with Proxmox, as it seems to be the most comprehensive of them all. I only hope I will be able to make the VMs work with GPU passthrough.
 
Also, running
Code:
dmesg | grep ecap
lists
[ 96.477085] vfio_ecap_init: 0000:4c:00.0 hiding ecap 0x19@0x270
[ 96.477096] vfio_ecap_init: 0000:4c:00.0 hiding ecap 0x1b@0x2d0
[ 301.202824] vfio_ecap_init: 0000:4c:00.0 hiding ecap 0x19@0x270
[ 301.202836] vfio_ecap_init: 0000:4c:00.0 hiding ecap 0x1b@0x2d0
[ 1869.592814] vfio_ecap_init: 0000:4c:00.0 hiding ecap 0x19@0x270
[ 1869.592832] vfio_ecap_init: 0000:4c:00.0 hiding ecap 0x1b@0x2d0

Does this mean interrupt remapping is not supported?

Well, it seems the image did show up on my monitor, it just took a lot longer than expected. In Windows, the only video adapter listed is the Vega Frontier Edition. Wasn't there supposed to be another "Standard VGA adapter", created by the video adapter emulation the VM does?
 
No. If you add 'x-vga=1' to the hostpci line, it tells Proxmox that this is the primary (and only) GPU in the VM.
 
The conclusion here is that it works, BUT:

1) It can take a while for the image to reach the monitor, so give it time.
2) If it still doesn't work, make sure the VM starts correctly without the PCI GPU, then add the GPU as a passed-through PCI device and set the VM to autostart with the host.
3) Restart and wait. An image should appear on the monitor.
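
Step 2 can also be done from the host shell with qm set instead of the web GUI. A sketch using the hostpci values from this thread; the function name enable_gpu_passthrough is made up, and the VM ID (100 in the usage line) is a placeholder for your own:

```shell
# enable_gpu_passthrough VMID
# Adds the GPU and its audio function to the VM and enables autostart.
# Stops at the first qm call that fails.
enable_gpu_passthrough() {
    vmid="$1"
    qm set "$vmid" --hostpci0 '4c:00.0,pcie=1,x-vga=on' &&
    qm set "$vmid" --hostpci1 '4c:00.1,pcie=1' &&
    qm set "$vmid" --onboot 1
}
```

Run as `enable_gpu_passthrough 100`, then reboot the host and give the monitor some time.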

My host boots Proxmox in legacy mode, but the machine is UEFI.
I think there are problems booting Proxmox in UEFI mode. I had trouble with it, left it in legacy mode, and it works.

The VM itself has an OVMF BIOS.
 
It seems the key to making this work was in the VM config. This

Code:
hostpci0: 4c:00.0,pcie=1,x-vga=on
hostpci1: 4c:00.1,pcie=1
was correct,

whereas at first I had tried
Code:
hostpci1: 4c:00, pcie=1, x-vga=on

Then I saw that the GPU and its sound chip counterpart were in different IOMMU groups.

What does "pcie=1" mean?

How can I find all the arguments an option takes, like the "hostpci" option? Is there a manual I can look up?
 
Then I saw that the GPU and its sound chip counterpart were in different IOMMU groups.
It should work nonetheless, maybe it is a quirk of the GPU.

What does "pcie=1" mean?
It means that the card is attached to a virtual PCI Express slot (instead of PCI). This only works with q35 (since the default platform does not have any PCIe slots)
and may not be necessary (some guest OSes/drivers have a problem when they see a PCIe card on a PCI slot).

How can I find all the arguments an option takes?
man qm

Generally, in the top right of the web GUI you will find a link to 'Documentation', which leads to the extensive reference documentation.
 

So both the GPU and the sound card are on the virtual PCIe slot numbered 1?
Code:
hostpci0: 4c:00.0,pcie=1,x-vga=on
hostpci1: 4c:00.1,pcie=1

Shouldn't I have written:
Code:
hostpci0: 4c:00.0,pcie=1,x-vga=on
hostpci1: 4c:00.1,pcie=2

I am asking because I plan to add more PCIe devices (an NVMe controller and a USB controller). Should I number these with 1 as well, or would I need to give them different numbers?
 
'pcie' is not really a number, only a flag (0/1, yes/no).
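
In other words, additional devices get their own hostpciN index and simply repeat pcie=1. A sketch of how the VM config might look with two more devices; XX:00.0 and YY:00.0 are placeholders (not real addresses from this host) for the NVMe and USB controllers, and machine: q35 is assumed because pcie=1 only works with q35:

```
machine: q35
hostpci0: 4c:00.0,pcie=1,x-vga=on
hostpci1: 4c:00.1,pcie=1
hostpci2: XX:00.0,pcie=1
hostpci3: YY:00.0,pcie=1
```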
 
