proxmox 6.2 breaks gpu passthru

Hi,

Have you tried spinning up your linux VM with gpu passthru? Does it start up?


I'm running Ryzen 1700 with Nvidia 1050ti GPU and ROM image.

That response was more toward Joris, since I think he is using an AMD GPU.

I just spun up that Ubuntu VM from the grave. It was so old it was running 19.04, and I had to update Ubuntu to 19.10. Even though Ubuntu seemed to indicate it saw the GPU and had even updated the drivers, running sudo nvidia-smi returned "Unable to determine the device handle for GPU 0000:00:01.0: Unknown Error".

I had to update my VM config to match my Windows VM, which I hadn't had to do before. These are the updates I had to make specifically:

Code:
args: -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,hv_synic,hv_stimer,hv_tlbflush,hv_ipi,hv_vendor_id=GIGABYTE,kvm=off' -machine 'kernel_irqchip=on'
cpu: host,hidden=1,flags=+pcid

I also specified a romfile:

Code:
hostpci0: 02:00.0,pcie=1,romfile=nvidia1070.rom

Prior to this, I did not have to specify any args nor a romfile.
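
For anyone comparing configs, a quick way to see whether a VM already carries these overrides is to grep its config file. This is only a sketch: it assumes the usual Proxmox location /etc/pve/qemu-server/<vmid>.conf, and the function name check_passthrough_conf is made up for illustration.

```shell
#!/bin/bash
# Report which of the passthrough-related keys from this post are present
# in a Proxmox VM config. check_passthrough_conf is a hypothetical helper.
check_passthrough_conf() {
    local conf="$1"
    local key
    for key in args cpu hostpci0; do
        if grep -q "^${key}:" "$conf"; then
            echo "${key}: set"
        else
            echo "${key}: missing"
        fi
    done
    # A romfile can sit on any hostpciN line, so check them all.
    if grep -q '^hostpci[0-9]*:.*romfile=' "$conf"; then
        echo "romfile: set"
    else
        echo "romfile: missing"
    fi
}

# Example (path is an assumption; 100 is a placeholder VM ID):
# check_passthrough_conf /etc/pve/qemu-server/100.conf
```

Anything reported as missing is a candidate for the changes above, not proof that they are needed.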

-TorqueWrench
 
Not sure what you are saying here exactly. You mean you had to modify the VM.conf to have the PCI passthrough work on with an upgraded Proxmox or with an upgraded VM or both ?

How did you test the PCI-E passthrough to function properly ?
 
That VM hasn't been spun up since Proxmox 6.1 and wasn't run since Disco Dingo (Ubuntu 19.04) was a thing, so that makes it at least a year old. When I booted it up, the GPU was nonfunctional (determined by running sudo nvidia-smi), though I could at least see it with lspci. It was functional when I last used it, however (again, about a year ago).

Since 19.04 is EOL, I upgraded the VM to 19.10 (which will be EOL in a month, btw), but still got the same error message: "Unable to determine the device handle for GPU 0000:00:01.0: Unknown Error". So I went ahead and updated my VM.conf to match my Windows VM, at which point it started working (again, determined by sudo nvidia-smi and by running a real workload with Handbrake).

Unfortunately, I can't tell you if the cause was the VM OS or Proxmox, only that it is once again working with the above changes.

By the way, I also get the vfio_ecap_init: hiding ecap message in dmesg, but it doesn't cause any problems for me and everything works fine:

[614626.213732] vfio-pci 0000:02:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[614626.217294] vfio-pci 0000:02:00.0: No more image in the PCI ROM

Again, I don't have an AMD GPU; I'm running a Xeon with an Nvidia GPU on this server, so this may be an apples to oranges comparison.
 
Well, I think the PCI-e passthrough is finally working.

The (not so) miracle solution is adding nomodeset to the kernel boot parameter string.

One stupefying observation is that the strings for the PCI devices change on reboot, so 00:nn.0 and 00:nn.1 both become 00:nn.

The GPU is now definitely visible in the MS Windows VM and the driver is installed.
Except ... after some time or some reboots ...


[Sat Jun 6 17:41:31 2020] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=17:00.0 address=0x7fb99efd0]
[Sat Jun 6 17:41:32 2020] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=17:00.0 address=0x7fb99f010]
[Sat Jun 6 17:41:32 2020] iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=17:00.0 address=0x7fb99f030]
[Sat Jun 6 17:41:58 2020] vfio-pci 0000:17:00.0: Refused to change power state, currently in D3
[Sat Jun 6 17:41:59 2020] vfio-pci 0000:17:00.0: timed out waiting for pending transaction; performing function level reset anyway
[Sat Jun 6 17:42:00 2020] vfio-pci 0000:17:00.0: not ready 1023ms after FLR; waiting
[Sat Jun 6 17:42:01 2020] vfio-pci 0000:17:00.0: not ready 2047ms after FLR; waiting
[Sat Jun 6 17:42:03 2020] vfio-pci 0000:17:00.0: not ready 4095ms after FLR; waiting
[Sat Jun 6 17:42:08 2020] vfio-pci 0000:17:00.0: not ready 8191ms after FLR; waiting
[Sat Jun 6 17:42:16 2020] vfio-pci 0000:17:00.0: not ready 16383ms after FLR; waiting

Seems exactly what is happening here https://gitlab.freedesktop.org/drm/amd/-/issues/346
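
To check quickly whether a host is hitting the same pattern, the kernel log can be filtered for these three messages. A rough sketch: the function name count_reset_failures is hypothetical, and the patterns are simply copied from the log lines above.

```shell
#!/bin/bash
# Count kernel log lines that match the failure patterns quoted above.
# Reads the log from stdin so it works with dmesg, journalctl -k or a file.
# count_reset_failures is a hypothetical helper name.
count_reset_failures() {
    grep -c -E 'IOTLB_INV_TIMEOUT|Refused to change power state|not ready [0-9]+ms after FLR'
}

# Example: dmesg | count_reset_failures
```

A count of 0 after a few VM start/stop cycles suggests the reset problem is not showing up on that boot.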
 
Hi,

I'm glad you got it working.

As for me, with my setup (Ryzen 1700, Nvidia 1050 Ti), I could not get GPU passthru working with my installation of Proxmox 6.2 (an in-place release upgrade from Proxmox 6.1). I tried a lot of command combinations with no success. I might start over with a fresh install of Proxmox 6.2, but for now I've got a fresh install of 6.1-3 running with my setup and GPU passthru functioning.
 
This seems a legit fix for most issues. I'm just suffering from a VM which loses vncproxy when switching the console; this can be fixed with: systemctl restart pveproxy. Not really how I hoped to fix it, but hey.

kernel boot parameters to test: nobar pci=noats amdgpu.pm=0 nomodeset

include amd_iommu_v2 in /etc/modules on the first line

specific to the Nvidia GPU there is nvidia-drm.modeset=0 or something similar

--- Still, not much improvement. The log is no longer flooded with error messages about D3 and such, but the really bad ones do not go away. These are why I created the bug report. See below.

For example, messages like these are not specific to kernel boot parameters. They are specific to the KVM guest: change the CPU from host to kvm64 to qemu64 and you see different such messages each time a VM starts.

[ 199.633423] kvm [13192]: vcpu1, guest rIP: 0xffffffffadaad41e ignored rdmsr: 0xc001100d
[ 199.633433] kvm [13192]: vcpu1, guest rIP: 0xffffffffadaad46e ignored wrmsr: 0xc001100d data 0x0
[Sat Jun 6 19:52:33 2020] vfio-pci 0000:17:00.0: Refused to change power state, currently in D3

Alternatively, I found the system quietest with respect to such messages when booting with the following kernel boot parameters: pci=noats,big_root_window pcie_ports=native pcie_aspm=off

Give it a go ?
 
Touch wood, for now it appears GPU passthrough is working and stable. Tested with CPU = qemu64, kvm64.

Sharing my cmdline appends here; they are AMD+AMD specific but should be easily customisable.

iommu=pt amd_iommu=forced_isolation nobar video=efifb:off text nomodeset rcu_nocbs=0-15 amdgpu.runpm=0 pci=noats,big_root_window pcie_ports=native pcie_aspm=off

nobar - gets rid of some BAR: ... messages
rcu_nocbs=0-15 - just because; it makes CPU handling cleaner with regard to IRQ handling
amdgpu.runpm=0 - showed notable improvement with the AMD GPU handling; something similar may exist for Nvidia
pci=noats,big_root_window - noats because of the IOTLB notifications, big_root_window just because it works for BAR on some GPUs
pcie_ports=native - in an attempt to avoid reset notifications
pcie_aspm=off - because amdgpu.runpm=0 alone did not cut it
iommu=pt - because it is recommended everywhere
amd_iommu=forced_isolation - to make sure no interplay with PCI-e devices would occur

Good luck.
 
Having the same issue with an upgrade to ProxMox 6.2 from 6.1.. no GPU pass through. Tried updating grub using Joris' cmdline but didn't work. Also tried-
iommu=pt amd_iommu=forced_isolation video=vesafb:off,efifb:off rcu_nocbs=0-15 amdgpu.runpm=0 pci=nobar,noats,big_root_window pcie_acs_override=downstream,multifunction pcie_pme=nomsi pcie_aspm=force pcie_ports=native
Didn't work either.
I can see the VM bootup screen, but after about 5 seconds it goes dark. Any suggestions?
 
Hey, thanks for sharing.

While the string worked for my set-up, the pass-through only became much simpler after a kernel update. This is the string I append and use today; note that the sequence is of some importance. The part in italics is optional and part of tests being performed.

iommu=pt amd_iommu=forced_isolation nobar video=vesafb:off video=efifb:off pci=pcie_bus_perf,pcie_scan_all,big_root_window pcie_ports=native

The VM screen going dark may be because you have not supplied a BIOS (ROM) file for the GPU. Also, with the AMD X370 chipset you need a GPU reset script which suspends the machine and resumes it to regain control of the GPU.

The main issue I'm facing is that the AMD proprietary drivers appear to be boycotted by every distro I found and have a quirk when working with Debian. This requires a hard-to-find but stupid hack: the word debian must be changed to ubuntu on the ID line of the /etc/os-release file.

Like this:

PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
NAME="Debian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
VERSION_CODENAME=stretch
ID=ubuntu
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
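
The edit itself can be scripted. A cautious sketch, assuming the file contains a plain ID=debian line and that GNU sed is available (as it is on Debian); set_os_release_id is a made-up helper name. It keeps a .bak copy so the change is easy to revert once the driver is installed.

```shell
#!/bin/bash
# Rewrite the ID= line of an os-release style file, keeping a backup copy.
# set_os_release_id is a hypothetical helper; run as root for /etc/os-release.
set_os_release_id() {
    local file="$1" new_id="$2"
    cp -p "$file" "${file}.bak" || return 1
    # Only the exact ID= key is touched; VERSION_ID= and the others stay as-is.
    # --follow-symlinks (GNU sed) edits the link target instead of replacing the link.
    sed -i --follow-symlinks "s/^ID=.*/ID=${new_id}/" "$file"
}

# Example: set_os_release_id /etc/os-release ubuntu
# Revert:  cp /etc/os-release.bak /etc/os-release
```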



These links may prove useful:

https://pve.proxmox.com/wiki/Pci_passthrough#The_.27romfile.27_Option
https://wiki.installgentoo.com/index.php/PCI_passthrough

ROM FILES https://www.techpowerup.com/download/ati-atiflash/

Configuration required


Code:
/etc/modules

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

aufs
overlay

I trust you know to replace x, y, z with the correct information.
disable_vga and disable_idle_d3 are optional.

Code:
/etc/modprobe.d/

blacklist amdgpu

# GPU and AUDIO

options vfio-pci ids=xxxx:yyyy,xxxx:zzzz disable_vga=1 disable_idle_d3=1
options kvm ignore_msrs=1
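
The vendor:device pairs for the ids= list come from lspci -nn, which prints them in square brackets on each device line. A small sketch that builds the options line from that output; vfio_ids_line is a hypothetical helper and the slot 02:00 in the example is only a placeholder for your own GPU's slot.

```shell
#!/bin/bash
# Build an "options vfio-pci ids=..." line from `lspci -nn` output on stdin.
# vfio_ids_line is a hypothetical helper; $1 is the bus:device slot, e.g. 02:00.
vfio_ids_line() {
    local slot="$1" ids
    # Keep only the functions of the given slot, pull out [vvvv:dddd],
    # drop the brackets and join the pairs with commas.
    ids=$(grep "^${slot}\." \
        | grep -o -E '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | paste -sd, -)
    [ -n "$ids" ] || return 1
    echo "options vfio-pci ids=${ids}"
}

# Example: lspci -nn | vfio_ids_line 02:00
```

The optional disable_vga=1 and disable_idle_d3=1 settings can be appended to the generated line by hand.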

Below is a copy of the script. Read it carefully; no guarantees, but it is required on this X370-based machine.


Bash:
#!/bin/bash
#
# Replace xx\:xx.x with the PCI address of your GPU and its sound counterpart.
# Must be run as root.
#
echo "disconnecting amd graphics"
echo "1" | tee -a /sys/bus/pci/devices/0000\:xx\:xx.0/remove
echo "disconnecting amd sound counterpart"
echo "1" | tee -a /sys/bus/pci/devices/0000\:xx\:xx.1/remove
echo "entering suspended state, press power button to continue"
echo -n mem > /sys/power/state
echo "reconnecting amd gpu and sound counterpart"
echo "1" | tee -a /sys/bus/pci/rescan
 
Wow, thanks Joris L.! For clarification, this is a Dell T7500 with Xeon L5640s and a Nvidia GTX 1060 GPU as the pass through video card. Let me go through the details from your post to see what may work.

Update: still nothing. The console is working (it shouldn't be), so the PCI passthrough starts correctly, then stops. Again, this had been working for a while on 6.1.

Grub:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream,multifunction video=vesafb:off video=efifb:off"
#GRUB_CMDLINE_DEFAULT="quiet intel_iommu=on"
GRUB_CMDLINE_LINUX=""

my modprobe.d is empty

/etc/modules

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
 
  • Did you update the 6.2 to the most recent version ?
  • Did you consider supplementing the bios as a rom-option file to the VM ?
Personally I use both iommu= and intel_iommu= to ensure the mode is set correctly.

For your setup: iommu=pt intel_iommu=on
Also try including the nomodeset parameter.

You really need this in modprobe.d to have passthrough working; it may be the only line required, AFAIK.

Code:
options vfio-pci ids=xxxx:yyyy,xxxx:zzzz disable_vga=1 disable_idle_d3=1
options kvm ignore_msrs=1

the vfio-pci line is how passthrough actually is enabled, leaving out the kvm line is up to you
 
I'm running 6.2-12 of ProxMox. Just tried to update/upgrade and all is current.
Have not considered supplementing the bios.... should I have to do that after the upgrade?

Will add iommu= in grub as well to see if that makes a difference.

My bad.. in modprobe.d/vfio.conf:

options vfio-pci ids=10de:1c03,10de:10f1 disable_vga=1
 
check for

pvx kernel: iommu: Default domain type: Passthrough

in the dmesg output with: dmesg | grep Passthrough
 
using dmesg | grep Passthrough returns nothing.

You may need to search /var/log/* instead; the dmesg buffer may have overflowed and no longer show it.

If you set iommu=pt then you should get: iommu: Default domain type: Passthrough (set via kernel command line)

Typically I also check for iommu in dmesg or /var/log/*.
There should be quite a few messages.
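
If dmesg has wrapped, the same line can be pulled from the journal or from the log files. A small sketch: iommu_domain_type is a made-up helper name, and /var/log/kern.log is an assumption about where the host keeps its kernel log.

```shell
#!/bin/bash
# Print the IOMMU default domain type found in a kernel log read from stdin.
# iommu_domain_type is a hypothetical helper name.
iommu_domain_type() {
    sed -n 's/.*iommu: Default domain type: \([A-Za-z]*\).*/\1/p' | tail -n 1
}

# Examples:
#   dmesg | iommu_domain_type
#   journalctl -k -b | iommu_domain_type      # if dmesg has already wrapped
#   cat /var/log/kern.log | iommu_domain_type
```

No output means the line was not found in that source at all.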

Do you know how to validate that the grub configuration is booting as configured?
I learned the hard way that Proxmox uses pve-efiboot-tool refresh, not update-grub or anything else.
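
One way to validate it is to compare the parameters you configured against /proc/cmdline, which shows what the running kernel was actually booted with. A minimal sketch: missing_boot_params is a hypothetical helper, and it assumes the parameters live in GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. On hosts that boot through pve-efiboot-tool the configured line may live elsewhere (commonly /etc/kernel/cmdline, as far as I know), so adjust accordingly.

```shell
#!/bin/bash
# Print every parameter from GRUB_CMDLINE_LINUX_DEFAULT that is absent from
# the booted kernel command line. missing_boot_params is a hypothetical helper.
missing_boot_params() {
    local grub_file="${1:-/etc/default/grub}" running_file="${2:-/proc/cmdline}"
    local configured running param
    # Take the last uncommented assignment and strip the key and the quotes.
    configured=$(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$grub_file" | tail -n 1 \
        | sed -e 's/^[^=]*=//' -e 's/^"//' -e 's/"$//')
    # Pad with spaces so each parameter can be matched as a whole word.
    running=" $(cat "$running_file") "
    for param in $configured; do
        case "$running" in
            *" ${param} "*) ;;       # parameter is active
            *) echo "$param" ;;      # configured but not booted
        esac
    done
}

# Example: missing_boot_params   # no output means everything is active
```

If parameters are listed as missing after a reboot, the boot entry was probably never regenerated from the edited file.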
 
Well, added in "iommu=pt" and ran dmesg | grep Passthrough

[ 1.725746] iommu: Default domain type: Passthrough (set via kernel command line)

It starts to boot through the GPU, then is redirected to the console.
 
Hello Joris!

Sorry for the delay in responding. My PSU went bad and just got another one installed. Whew.

GPU passthrough not working. USB devices are, but I think they always have been.
 