[TUTORIAL] GPU Passthrough - Radeon 6800xt and beyond.

potts202p

New Member
Jan 17, 2021
Supporting links:

[The bulk of the steps]

https://blog.quindorian.org/2018/03/building-a-2u-amd-ryzen-server-proxmox-gpu-passthrough.html/

This tutorial draws heavily on the one above; however, it has been modified to explain how to pass through a GPU that spans several PCI buses.

This tutorial also addresses issues with IOMMU splitting which the original did not.

Additionally, several steps have been modified as they were outdated in the previous tutorial.



Change boot parameters

Inside of /etc/default/grub change the following line to include:

GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on video=efifb:off"


This changes the GRUB boot parameters to enable the AMD IOMMU in passthrough (pt) mode. It also passes a video parameter that is essential to making the first slot available for GPU passthrough.

After changing the line, run:

update-grub

to make sure the changes are taken into account after you reboot. After the command completes, reboot.
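
If you would rather script the edit than open the file by hand, the sketch below shows one way to add a parameter only when it is missing. The helper name add_grub_param is hypothetical and not part of the original tutorial, and the demo works on a scratch file rather than on /etc/default/grub. It assumes GNU sed, as shipped with Proxmox.

```shell
# Hypothetical helper (not from the original post): add one kernel parameter to
# GRUB_CMDLINE_LINUX_DEFAULT in the given file unless it is already there.
add_grub_param() {
    grub_file="$1"
    grub_param="$2"
    # Current value between the double quotes.
    grub_current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$grub_file")
    for grub_word in $grub_current; do
        # Already present: nothing to do.
        [ "$grub_word" = "$grub_param" ] && return 0
    done
    # Insert the parameter just before the closing quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"\$|\1 ${grub_param}\"|" "$grub_file"
}

# Demo on a scratch copy, so /etc/default/grub is not touched.
demo_grub=$(mktemp)
echo 'GRUB_CMDLINE_LINUX_DEFAULT="quiet"' > "$demo_grub"
for p in iommu=pt amd_iommu=on video=efifb:off; do
    add_grub_param "$demo_grub" "$p"
done
cat "$demo_grub"
rm -f "$demo_grub"
```

On a real host you would point the helper at /etc/default/grub (after making a backup) and still run update-grub afterwards.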

After the reboot (make sure IOMMU is enabled in your BIOS), check whether IOMMU is now active:

root@duhmedia:/etc/default# dmesg | grep -e DMAR -e IOMMU

This command should yield output similar to:

[ 0.594621] AMD-Vi: IOMMU performance counters supported

[ 0.596624] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40

[ 0.597487] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).

[ 14.039859] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
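
As a second check that does not depend on the exact dmesg wording, you can count the IOMMU groups the kernel exposes under /sys/kernel/iommu_groups (the same path used later in this tutorial). This is a minimal sketch; the function name count_iommu_groups is made up for illustration, and it takes an optional directory argument only so that it can be tried on a test directory.

```shell
# Count IOMMU groups under the given directory
# (default: the kernel's sysfs path used later in this tutorial).
count_iommu_groups() {
    iommu_dir="${1:-/sys/kernel/iommu_groups}"
    iommu_count=0
    for iommu_group in "$iommu_dir"/*/; do
        # The glob stays literal when nothing matches, so test each entry.
        [ -d "$iommu_group" ] && iommu_count=$((iommu_count + 1))
    done
    echo "$iommu_count"
}

if [ "$(count_iommu_groups)" -gt 0 ]; then
    echo "IOMMU groups found: $(count_iommu_groups)"
else
    echo "No IOMMU groups found; check the BIOS setting and boot parameters"
fi
```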

Blacklist drivers from loading

You need to blacklist the drivers for the card you want to use. I have completed a GPU passthrough with a Radeon 6800 XT and a Radeon 270X. However, this tutorial is a modified version of an Nvidia passthrough tutorial, so either should work.

If you want to pass through an Nvidia card, run the following commands:

echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf

echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf

If you want to pass through an AMD card, run the following command:

echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf

At this point the drivers should be blacklisted.
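
Because ">>" appends blindly, running the commands above twice leaves duplicate lines in blacklist.conf. The sketch below shows one way to append an entry only once. The helper name blacklist_module is hypothetical, and the demo writes to a scratch file rather than /etc/modprobe.d/blacklist.conf. As an assumption beyond the original post: recent AMD cards are typically driven by the amdgpu module rather than radeon, so it is worth checking with "lspci -k" which module your card actually uses before choosing the name to blacklist.

```shell
# Hypothetical helper: append "blacklist <module>" to a file only if that exact
# line is not already present, so re-running the step does not duplicate entries.
blacklist_module() {
    bl_file="$1"
    bl_line="blacklist $2"
    touch "$bl_file"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    grep -qxF "$bl_line" "$bl_file" || echo "$bl_line" >> "$bl_file"
}

# Demo on a scratch file instead of /etc/modprobe.d/blacklist.conf.
demo_bl=$(mktemp)
blacklist_module "$demo_bl" radeon
blacklist_module "$demo_bl" radeon   # second call is a no-op
cat "$demo_bl"
rm -f "$demo_bl"
```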



When that is done, enter the following commands to make sure the VFIO modules are loaded during kernel initialization:

echo vfio >> /etc/modules

echo vfio_iommu_type1 >> /etc/modules

echo vfio_pci >> /etc/modules

echo vfio_virqfd >> /etc/modules

When that is done, run the following to rebuild the initramfs so the module changes take effect at boot:

update-initramfs -u
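
Before rebooting, it can help to confirm that all four module names really ended up in /etc/modules. The following is a small sketch of such a check. The function name missing_vfio_modules is invented for this example, and the demo uses a scratch file that deliberately lacks two of the modules.

```shell
# Print each VFIO module from this tutorial that is missing from the given file.
missing_vfio_modules() {
    mod_file="$1"
    for mod_name in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
        # -x so that "vfio" does not match "vfio_pci".
        grep -qxF "$mod_name" "$mod_file" 2>/dev/null || echo "$mod_name"
    done
}

# Demo: a scratch file that only lists two of the four modules.
demo_mod=$(mktemp)
printf '%s\n' vfio vfio_pci > "$demo_mod"
missing_vfio_modules "$demo_mod"
rm -f "$demo_mod"
```

On the host you would call missing_vfio_modules /etc/modules; no output means every module is listed.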



This next part was tough to understand. In all previous tutorials I have seen, a GPU has been confined to one “PCI bus” (this may or may not actually be a PCI bus, but that is what I will call it in this tutorial). My Radeon 6800 XT used 3 buses. I am unsure if this is a quirk of my motherboard or a quirk of the card, but either way we are going to walk through how to solve this issue.

First, update the local PCI ID database so that lspci reports current device names:

update-pciids

We need to list the PCI devices that we want to pass through to our VM:

lspci -v

which will output a whole lot of information about all the PCI cards in your system:

04:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c1) (prog-if 00 [Normal decode])

………………………….

Kernel driver in use: pcieport



05:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (prog-if 00 [Normal decode])

…………………………..

Kernel driver in use: pcieport



06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c1) (prog-if 00 [VGA controller])

Subsystem: XFX Limited XFX Speedster MERC 319 AMD Radeon RX 6800 XT Black

…………………………….

Kernel driver in use: vfio-pci



06:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device ab28

…………………………….

Kernel driver in use: vfio-pci

Kernel modules: snd_hda_intel



06:00.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 73a6 (prog-if 30 [XHCI])

…………………………

Kernel driver in use: xhci_hcd

Kernel modules: xhci_pci



06:00.3 Serial bus controller [0c80]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 USB

………………………….

Kernel driver in use: vfio-pci



My particular card uses several PCI buses (04:00.0, 05:00.0, 06:00.0); in addition, 06:00 has several functions of its own (06:00.0 through 06:00.3). We will need all of these to get our GPU working. On many systems your GPU may not use 3 buses, and that’s okay. We must now get the vendor:device ID of each of these devices. Remember, if you ever change anything in your device configuration (enable/disable sound card, USB, add a NIC, etc.) the order of the PCI devices can change and the following will need to be changed accordingly.

Run:

lspci -n -s 06:00

This will give you the IDs for your card. Below is an example for the 06:00 bus; note that 4 IDs appear, the same number of items associated with 06:00 above.

root@pve:~# lspci -n -s 06:00

06:00.0 0300: 1002:73bf (rev c1)

06:00.1 0403: 1002:ab28

06:00.2 0c03: 1002:73a6

06:00.3 0c80: 1002:73a4

Note down the rightmost group of 8 characters on each line (the vendor:device ID); for example, 1002:73bf is the ID on the first line.

Run the same lspci -n -s XX:XX command for the other PCI devices in question (05:00 and 04:00 for us).
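
Copying the IDs by hand is easy to get wrong, so here is a minimal sketch that pulls the third column out of "lspci -n" output and joins it with commas. The function name pci_ids_csv is hypothetical. The demo feeds in the 06:00 output quoted above; on the host you would pipe in "lspci -n -s 06:00" instead.

```shell
# Read `lspci -n` output on stdin and print the vendor:device IDs (third
# column) as one comma-separated list without spaces.
pci_ids_csv() {
    awk '{ print $3 }' | paste -sd, -
}

# Demo using the 06:00 output quoted above.
printf '%s\n' \
    '06:00.0 0300: 1002:73bf (rev c1)' \
    '06:00.1 0403: 1002:ab28' \
    '06:00.2 0c03: 1002:73a6' \
    '06:00.3 0c80: 1002:73a4' | pci_ids_csv
```

For the sample input this prints 1002:73bf,1002:ab28,1002:73a6,1002:73a4.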



Now that we know that, run the following command with the IDs you noted (the groups of 8 characters). I only wrote down the first couple; do not actually put “…” instead of a real ID. The list must be comma-separated with no spaces:

echo "options vfio-pci ids=1002:73bf,1002:ab28,…,…,…,…,…,… disable_vga=1" > /etc/modprobe.d/vfio.conf

Once that is done, reboot your system.

Once the reboot is complete, use lspci -v to check that the card is now using the vfio-pci driver instead of the vendor driver, as can be seen in my example above.
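
Scanning the lspci -v output by eye gets tedious, so the sketch below filters it down to the devices that are not bound to vfio-pci. The function name unbound_devices is made up for this example, and the demo input is an abbreviated copy of my lspci output from earlier.

```shell
# Read `lspci -v` (or `lspci -k`) output on stdin and print every device whose
# "Kernel driver in use" is something other than vfio-pci.
unbound_devices() {
    awk '
        # Device header lines start with an address such as 06:00.0
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / { dev = $1 }
        /Kernel driver in use:/ && $NF != "vfio-pci" { print dev, $NF }
    '
}

# Demo with an abbreviated copy of the output shown earlier.
printf '%s\n' \
    '06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21' \
    '    Kernel driver in use: vfio-pci' \
    '06:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device ab28' \
    '    Kernel driver in use: vfio-pci' \
    '06:00.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 73a6' \
    '    Kernel driver in use: xhci_hcd' | unbound_devices
```

For the sample this prints "06:00.2 xhci_hcd". On the host, restrict the input to the card, for example "lspci -v -s 06:00 | unbound_devices", since every unrelated device in the system would otherwise be listed too.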



Creating the VM

Create a new virtual machine inside of Proxmox. During the wizard make sure to select these things:

  • Create the VM using “SCSI” as the Hard Disk controller
  • Under CPU select type “Host”
  • Under Network select Model “VirtIO”
  • Under Options, change the BIOS to “OVMF (UEFI)”
  • Then under Hardware click Add and select “EFI Disk”
  • In System, under Machine, ensure you have “q35” selected
  • OPTIONAL: I have found that it is far easier to get a UEFI Windows setup working if you boot from a USB with a clean .ISO (“borrowed” ISOs do not work). If you have never created a UEFI Windows setup, I suggest you download the Windows “Media Creation Tool” and flash a fresh ISO onto a USB. This has been the best way I have found to get Windows working with UEFI. If you decide to go this route, set the bootable media (in the OS tab) to “no bootable media”.
You may think that these settings are not important, but they will literally make or break this tutorial, so ensure they are set up exactly as stated above.

After the wizard completes, we need to change a few things:

  • In addition to linking the default DVD-ROM drive to a Windows 10 ISO (if you are passing through to Windows), create a second DVD-ROM drive and link the VirtIO driver ISO; you will need it while installing Windows. Make sure the second DVD-ROM drive is assigned IDE 0
  • If you decided to do the optional step above, take your fresh Windows USB and plug it into your machine. Go to your VM’s hardware settings and add the USB. Go to Options -> Boot Order and ensure it is both enabled and first in the boot order.

Make sure you can remote control

Once you are in Windows, install the VirtIO drivers. To do this, go to your “PC” and select the appropriate disk; at the bottom there should be a virtio-win-guest-tools.exe to install the drivers. This will allow us to remote into the machine. There are several tutorials on how to do this in depth; please refer to those if this step lost you.

You also need to make sure that you can log in to the VM remotely. The reason is that if you boot the VM with GPU passthrough, it disables the standard VGA adapter, after which the built-in VNC/SPICE console from Proxmox will no longer work. You can use RDP or any other remote access method of your choice. If you decide to go the RDP route, go to “Remote Desktop Settings” and check that “Enable Remote Desktop” is turned on.

Once that is done, shut down the VM; we need to make another config change!



Separating IOMMU groups

Note: This next step can be considered a slight security risk by some. As I understand it, it may allow GPUs to “talk to each other”. This will likely not be an issue for you and should not cause functional problems, but it may be a consideration if security on a VM with GPU passthrough is of the utmost importance.

This next section may be optional depending on your system. When you pass in a PCI device, you pass in all the devices in that PCI device’s IOMMU group. This can cause issues, as system-critical functions are often linked to the same IOMMU group as your PCI device. If that is the case and you do not separate your IOMMU groups, your system will crash. NOTE: The system will only crash when you try to boot the “problem VM” that is causing the IOMMU problem. To ensure this does not brick your system, MAKE SURE IT IS NOT SET TO BOOT AT STARTUP (yet).

To determine if you will have this problem run the following command:

for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done;



The command above lists the IOMMU groups in your system. If the PCI buses that your GPU uses share a group with other devices on the system (devices that are not related to the GPU in question), we will need to split up these groups before continuing. To split up your IOMMU groups do the following:

Open the GRUB configuration with “nano /etc/default/grub” and insert the following parameter after amd_iommu=on:

INSERT THIS: pcie_acs_override=downstream,multifunction

The full line should look something like this:

GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt amd_iommu=on pcie_acs_override=downstream,multifunction video=efifb:off"

You can now exit out of nano with “ctrl-x” and run the following command:

update-grub

Restart the system and run the following command again:

for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done;

If you did everything properly you should have all your devices split into different IOMMU groups.
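
To avoid reading the whole listing by eye, the sketch below filters the output of the loop above down to the devices that still share a group with the GPU. The function name shared_with is hypothetical, and the three demo lines are invented sample data in the same "IOMMU group N address description" format, not output from my system.

```shell
# Read the IOMMU listing on stdin and print every device that shares a group
# with a device whose address starts with the given prefix (e.g. 06:00),
# leaving out the matching devices themselves.
shared_with() {
    awk -v p="$1" '
        { grp[NR] = $3; dev[NR] = $4; line[NR] = $0
          if (index($4, p) == 1) wanted[$3] = 1 }
        END { for (i = 1; i <= NR; i++)
                  if ((grp[i] in wanted) && index(dev[i], p) != 1) print line[i] }
    '
}

# Demo with invented sample lines (not from a real system).
printf '%s\n' \
    'IOMMU group 14 06:00.0 VGA compatible controller [0300]: example GPU' \
    'IOMMU group 14 01:00.0 Ethernet controller [0200]: example NIC' \
    'IOMMU group 15 02:00.0 SATA controller [0106]: example disk controller' \
    | shared_with 06:00
```

For the sample this prints only the Ethernet controller line, since it sits in group 14 together with the GPU. No output means nothing else shares a group with the given address.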



Passing through the GPU

Now that we have everything prepared, we can finally pass through the GPU.

Navigate to the hardware section of your VM and add a PCI device. Your GPU should be listed, or you may need to find the bus that is associated with your GPU (recall our GPU-related buses were 04:00, 05:00 and 06:00). If your GPU has many buses like mine, YOU DO NOT NEED TO PASS THEM ALL THROUGH. It is likely that only one of the possibly many buses related to your GPU appears. That is fine; choose that one only.

Next, ensure that the GPU is set to “Non-Primary” with 'All functions' and 'PCI Express' set.

Now Start VM and install the relevant drivers

Reboot then stop VM

Navigate to your VMs hardware section and set the GPU as primary GPU and set Display to VirtIO.
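
For those who prefer the command line, the same two settings can be expressed as a hostpci0 option for "qm set". This is a rough sketch under assumptions, not something I tested as part of this tutorial: the helper name hostpci_value is hypothetical, VM ID 100 is a placeholder, and the mapping of the GUI checkboxes ('All functions' as an address without a function suffix, 'PCI Express' as pcie=1, primary GPU as x-vga=1) reflects my understanding of the Proxmox options. The script only prints the commands so you can review them before running anything.

```shell
# Hypothetical helper: build the value for a hostpciN option.
#   $1 = PCI address without a function suffix (passes all functions)
#   $2 = "primary" to add x-vga=1, anything else for a non-primary GPU
hostpci_value() {
    hp_value="$1,pcie=1"
    [ "$2" = "primary" ] && hp_value="$hp_value,x-vga=1"
    echo "$hp_value"
}

# Print (do not run) the commands for the two stages; 100 is a placeholder VM ID.
echo "qm set 100 --hostpci0 $(hostpci_value 06:00 secondary)"
echo "qm set 100 --hostpci0 $(hostpci_value 06:00 primary)"
```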

If all was done properly, your VM should not show its screen in the Proxmox portal. You must remote into it using RDP or the remote software you installed earlier.

Congrats, you have finished your PCI passthrough.
 