GPU Passthrough Issues on kernel 5.15

photonate

Member
Jan 21, 2021
Hey all,

I have a Proxmox server running on kernel 5.13.19-4 where I have my GPU passthrough working perfectly. I am interested in getting ahead of the release of the 5.15 kernel, and I have opted in and gotten everything installed. However, on this newer 5.15 kernel I have only ever received the BAR 0: can't reserve memory error when booting the VM.

I have seen suggestions to add
Code:
video=simplefb:off
to the GRUB_CMDLINE_LINUX_DEFAULT line. I have tried it both as the only video option and in addition to the
Code:
nofb nomodeset video=vesafb:off efifb:of
options that my 5.13 kernel needs for the GPU to pass through, but I have not had any luck. No matter what I try, the only error I get is BAR 0: can't reserve memory.

My Proxmox box is a Dell T7810, running two Intel Xeon E5-2690V3 with 128GB of DDR4 ECC RAM, and I am trying to pass through an Nvidia 1650 Super. Again, I have everything working on kernel 5.13, but I haven't found the right changes to make to upgrade to 5.15.

If possible, if anyone is successfully passing through a GPU on kernel 5.15, I would love to see your GRUB_CMDLINE_LINUX_DEFAULT options.

Thanks!
 
If you really want to know, mine looks like GRUB_CMDLINE_LINUX_DEFAULT="quiet".
Note that nofb nomodeset video=vesafb:off efifb:of is not correct; it should be nofb nomodeset video=vesafb:off video=efifb:off.

I had the same issue with kernel 5.15 and an RX570 GPU. simplefb was introduced with kernel 5.15 and took over the console from efifb (I was using video=efifb:off). A workaround was to add video=simplefb:off.

Eventually, I removed all those settings and let the amdgpu driver handle it (by no longer blacklisting it), which works fine for me, but that won't work for your NVIDIA GPU.
 
Thank you! Embarrassingly, I had the correct nofb nomodeset video=vesafb:off video=efifb:off in my GRUB_CMDLINE_LINUX_DEFAULT, but out of laziness I posted the line from the tutorial I used, where it was incorrect. I completely forgot about the day it took me to find the correction and get it running the first time.

I am still getting the BAR 0: can't reserve memory error when I try the 5.15 kernel. I am wondering what else may be needed to move up to the 5.15 kernel. Right now, if my understanding is correct, I should be good with GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=simplefb:off" but alas.

I should also mention that this is obviously not a production environment, and I am really just diving into this issue to try and improve my understanding of virtualization and Linux. I am re-reading the Prox Wiki pages related to this topic, but would appreciate any other advice or resources.
 
Make sure you are using the latest kernel 5.15 as there have been improvements over time. My suggestion was to use video=simplefb:off together with video=vesafb:off video=efifb:off. Have you tried that?
 
I believe I am completely up to date. I'm testing with Linux 5.15.30-2-pve #1 SMP PVE 5.15.30-3 (Fri, 22 Apr 2022 18:08:27 +0200) and pve-manager/7.1-12/b3c09de3, and my GRUB line looks like this: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt video=simplefb:off video=efifb:off video=vesafb:off"

I have also tried almost every combination of adding the video=simplefb:off while removing or just moving other parts of the line, but I'm not sure I'm seeing a difference. Still just endless BAR 0: can't reserve memory errors.
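When it is hard to tell whether a change to the GRUB line made any difference, it can help to confirm that the options actually reached the running kernel. Below is a minimal sketch, assuming the host boots through GRUB and was rebooted after update-grub; the helper name missing_params is made up for illustration. As far as I know, hosts that boot through systemd-boot (for example UEFI installs with ZFS on root) take their options from /etc/kernel/cmdline instead, so editing /etc/default/grub has no effect there.

```shell
#!/bin/sh
# Report which expected parameters are absent from a kernel command line.
# missing_params is a hypothetical helper, not part of Proxmox.
# Usage: missing_params "<cmdline string>" param1 param2 ...
missing_params() {
    cmdline=$1
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;                # present as a whole word
            *) printf '%s\n' "$param" ;;    # absent: print it
        esac
    done
}

# Example (run on the host after a reboot):
#   missing_params "$(cat /proc/cmdline)" \
#       video=simplefb:off video=efifb:off video=vesafb:off
```

No output means every listed option is active; any line printed is an option the running kernel never saw.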
 
Some driver or device appears to be claiming the memory. Can you please show the whole BAR 0 error message including the memory range? What does cat /proc/iomem show?
 
Of course, thank you for taking the time to help me dive into this.

My kern.log shows this
Apr 29 05:54:33 prox kernel: [ 126.700860] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 05:54:33 prox kernel: [ 126.700877] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 05:54:33 prox kernel: [ 126.700890] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 05:54:33 prox kernel: [ 126.700903] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 05:54:33 prox kernel: [ 126.700918] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 05:54:33 prox kernel: [ 126.700931] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 05:54:33 prox kernel: [ 126.700944] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 05:54:33 prox kernel: [ 126.700957] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 05:54:33 prox kernel: [ 126.700970] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 05:54:33 prox kernel: [ 126.700986] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]

and then

root@prox:~# cat /proc/iomem
00000000-00000fff : Reserved
00001000-0003efff : System RAM
0003f000-0003ffff : Reserved
00040000-0009ffff : System RAM
000a0000-000bffff : PCI Bus 0000:00
000f0000-000fffff : System ROM
00100000-d4e46fff : System RAM
d4e47000-d4e79fff : Reserved
d4e7a000-d90dffff : System RAM
d90e0000-d9c79fff : Reserved
d9c7a000-d9ce0fff : ACPI Tables
d9ce1000-da95dfff : ACPI Non-volatile Storage
da95e000-daff3fff : Reserved
daff4000-db168fff : Unknown E820 type
db169000-db169fff : System RAM
db16a000-db177fff : Reserved
db178000-dcffffff : System RAM
dd000000-ddffffff : Reserved
de000000-dfffffff : RAM buffer
e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
  e0000000-efffffff : Reserved
f0000000-f3ffbfff : PCI Bus 0000:00
  f0000000-f01fffff : PCI Bus 0000:04
  f0200000-f07fffff : PCI Bus 0000:02
    f0200000-f027ffff : 0000:02:00.0
    f0280000-f02bffff : 0000:02:00.0
      f0280000-f02bffff : vfio-pci
    f02c0000-f06bffff : 0000:02:00.0
    f06c0000-f06c3fff : 0000:02:00.0
      f06c0000-f06c3fff : vfio-pci
    f06c4000-f0703fff : 0000:02:00.0
  f2000000-f30fffff : PCI Bus 0000:03
    f2000000-f2ffffff : 0000:03:00.0
      f2000000-f2ffffff : vfio-pci
    f3000000-f307ffff : 0000:03:00.0
    f3080000-f3083fff : 0000:03:00.1
      f3080000-f3083fff : vfio-pci
    f3084000-f3084fff : 0000:03:00.3
      f3084000-f3084fff : vfio-pci
  f3100000-f32fffff : PCI Bus 0000:01
    f3100000-f31fffff : 0000:01:00.0
    f3200000-f32fffff : 0000:01:00.0
      f3200000-f32fffff : mlx4_core
  f3400000-f341ffff : 0000:00:19.0
    f3400000-f341ffff : e1000e
  f3420000-f342ffff : 0000:00:14.0
    f3420000-f342ffff : xhci-hcd
  f3430000-f34307ff : 0000:00:1f.2
    f3430000-f34307ff : ahci
  f3431000-f34313ff : 0000:00:1d.0
    f3431000-f34313ff : ehci_hcd
  f3432000-f34323ff : 0000:00:1a.0
    f3432000-f34323ff : ehci_hcd
  f3433000-f3433fff : 0000:00:19.0
    f3433000-f3433fff : e1000e
  f3435000-f34357ff : 0000:00:11.4
    f3435000-f34357ff : ahci
  f3436000-f3436fff : 0000:00:05.4
f3ffc000-f3ffcfff : dmar2
f3ffd000-f3ffdfff : dmar1
f4000000-f7ffbfff : PCI Bus 0000:e0
  f7f00000-f7f00fff : 0000:e0:05.4
f7ffc000-f7ffcfff : dmar0
fec00000-fecfffff : PNP0003:00
  fec00000-fec003ff : IOAPIC 0
  fec01000-fec013ff : IOAPIC 1
  fec40000-fec403ff : IOAPIC 2
fed00000-fed003ff : HPET 0
  fed00000-fed003ff : PNP0103:00
fed12000-fed1200f : pnp 00:01
fed12010-fed1201f : pnp 00:01
fed1b000-fed1bfff : pnp 00:01
fed1c000-fed1ffff : Reserved
  fed1f410-fed1f414 : iTCO_wdt.1.auto
fed40000-fed40fff : 00:04 TPM
fed45000-fed8bfff : pnp 00:01
fee00000-feefffff : pnp 00:01
  fee00000-fee00fff : Local APIC
ff000000-ffffffff : Reserved
  ff000000-ffffffff : pnp 00:01
100000000-201fffffff : System RAM
  de7400000-de84025bf : Kernel code
  de8600000-de8ff0fff : Kernel rodata
  de9000000-de943e4bf : Kernel data
  de9736000-de9bfffff : Kernel bss
30000000000-33fffffffff : PCI Bus 0000:00
  30000000000-300001fffff : PCI Bus 0000:04
  33fe0000000-33ff20fffff : PCI Bus 0000:03
    33fe0000000-33fefffffff : 0000:03:00.0
      33fe0000000-33fe02fffff : BOOTFB
    33ff0000000-33ff1ffffff : 0000:03:00.0
      33ff0000000-33ff1ffffff : vfio-pci
    33ff2000000-33ff203ffff : 0000:03:00.2
      33ff2000000-33ff203ffff : vfio-pci
    33ff2040000-33ff204ffff : 0000:03:00.2
      33ff2040000-33ff204ffff : vfio-pci
  33ff2800000-33ff2ffffff : PCI Bus 0000:01
    33ff2800000-33ff2ffffff : 0000:01:00.0
      33ff2800000-33ff2ffffff : mlx4_core
  33ffff00000-33ffff03fff : 0000:00:1b.0
    33ffff00000-33ffff03fff : ICH HD audio
  33ffff05000-33ffff050ff : 0000:00:1f.3
  33ffff07000-33ffff0700f : 0000:00:16.0
    33ffff07000-33ffff0700f : mei_me
34000000000-37fffffffff : PCI Bus 0000:e0
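A listing this long is hard to scan by eye, so here is a rough sketch that prints only the entries lying inside the range named in the BAR error. The function name iomem_inside is made up for illustration, and the sketch assumes the usual "start-end : name" layout of /proc/iomem. It needs to run as root, since unprivileged reads show the addresses as zeros.

```shell
#!/bin/sh
# List every /proc/iomem entry that lies inside a given address range.
# iomem_inside is a hypothetical helper name.
# Usage: iomem_inside <iomem-file> <start-hex> <end-hex>
iomem_inside() {
    file=$1
    lo=$((0x$2))
    hi=$((0x$3))
    # Strip the nesting indentation, then parse "start-end : name".
    sed 's/^ *//' "$file" | while IFS= read -r line; do
        case $line in
            *-*" : "*) ;;       # looks like a normal entry
            *) continue ;;      # skip anything else
        esac
        range=${line%% : *}
        name=${line#* : }
        start=$((0x${range%-*}))
        end=$((0x${range#*-}))
        if [ "$start" -ge "$lo" ] && [ "$end" -le "$hi" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Example with the range from the kern.log message:
#   iomem_inside /proc/iomem 33fe0000000 33fefffffff
```

Any name printed besides the PCI device itself and vfio-pci is something else holding part of that BAR.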
 
33fe0000000-33fe02fffff : BOOTFB is the problem. I don't understand why video=vesafb:off video=efifb:off video=simplefb:off won't fix this.
You could try running these commands (in order, before starting the VM) to see if the BOOTFB line in /proc/iomem disappears:
echo 0 | tee /sys/class/vtconsole/vtcon*/bind
echo 'vesa-framebuffer.0' >/sys/bus/platform/drivers/vesa-framebuffer/unbind
echo 'efi-framebuffer.0' >/sys/bus/platform/drivers/efi-framebuffer/unbind
echo 'simple-framebuffer.0' >/sys/bus/platform/drivers/simple-framebuffer/unbind
If they work, the Proxmox host console will disappear, so please run them via SSH from another system.
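Only one of the three framebuffer drivers is normally bound at a time, so the other unbind commands are expected to fail with "No such device". A small sketch that checks first and reports what it did is shown here; unbind_fb is a made-up helper name, and the optional third argument exists only so the function can be tried against a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# Unbind a platform framebuffer device only if it is actually bound.
# unbind_fb is a hypothetical helper name.
# Usage: unbind_fb <driver> <device> [sysfs-root]
unbind_fb() {
    driver_dir=${3:-/sys}/bus/platform/drivers/$1
    # A bound device shows up as an entry inside the driver's directory.
    if [ -e "$driver_dir/$2" ]; then
        printf '%s\n' "$2" > "$driver_dir/unbind" && echo "unbound $2"
    else
        echo "skipped $2 (not bound to $1)"
    fi
}

# Example, as root over SSH, after detaching the virtual consoles:
#   unbind_fb vesa-framebuffer   vesa-framebuffer.0
#   unbind_fb efi-framebuffer    efi-framebuffer.0
#   unbind_fb simple-framebuffer simple-framebuffer.0
```

The "unbound" line tells you which driver was holding the console on that boot.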
 
Sorry for the delay; I didn't have as much time to test this at work as I usually do. I am also glad for the sanity check around the GRUB lines being enough to prevent the memory from being occupied. I also again appreciate the help in diagnosing this, I hope this is as good of a brain teaser for you as it has been for me.

Here's how it worked out, apologies in advance as I'm sure the formatting will be a little ugly. I should note that this is with the video=vesafb:off video=efifb:off video=simplefb:off in my GRUB_CMDLINE_LINUX_DEFAULT

Code:
root@prox:~# echo 0 | tee /sys/class/vtconsole/vtcon*/bind
0
root@prox:~# echo 'vesa-framebuffer.0' >/sys/bus/platform/drivers/vesa-framebuffer/unbind
-bash: echo: write error: No such device
root@prox:~# echo 'efi-framebuffer.0' >/sys/bus/platform/drivers/efi-framebuffer/unbind
-bash: echo: write error: No such device
root@prox:~# echo 'simple-framebuffer.0' >/sys/bus/platform/drivers/simple-framebuffer/unbind
-bash: echo: write error: No such device


root@prox:~# tail /var/log/kern.log
Apr 29 16:48:42 prox kernel: [  111.165790] fwbr102i1: port 1(fwln102i1) entered blocking state
Apr 29 16:48:42 prox kernel: [  111.165792] fwbr102i1: port 1(fwln102i1) entered forwarding state
Apr 29 16:48:42 prox kernel: [  111.168995] fwbr102i1: port 2(tap102i1) entered blocking state
Apr 29 16:48:42 prox kernel: [  111.168999] fwbr102i1: port 2(tap102i1) entered disabled state
Apr 29 16:48:42 prox kernel: [  111.169112] fwbr102i1: port 2(tap102i1) entered blocking state
Apr 29 16:48:42 prox kernel: [  111.169115] fwbr102i1: port 2(tap102i1) entered forwarding state
Apr 29 16:48:47 prox kernel: [  116.382808] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
Apr 29 16:48:47 prox kernel: [  116.382828] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
Apr 29 16:48:47 prox kernel: [  116.384094] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:47 prox kernel: [  116.402672] vfio-pci 0000:03:00.1: enabling device (0100 -> 0102)
root@prox:~# tail /var/log/kern.log
Apr 29 16:48:55 prox kernel: [  124.015138] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:55 prox kernel: [  124.015151] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:55 prox kernel: [  124.015164] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:55 prox kernel: [  124.015176] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:55 prox kernel: [  124.015195] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:55 prox kernel: [  124.015207] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:55 prox kernel: [  124.015220] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:55 prox kernel: [  124.015233] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:55 prox kernel: [  124.015245] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 16:48:55 prox kernel: [  124.015258] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]

This is with just the IOMMU options in the GRUB line:

Code:
root@prox:~# echo 0 | tee /sys/class/vtconsole/vtcon*/bind
0
root@prox:~# echo 'vesa-framebuffer.0' >/sys/bus/platform/drivers/vesa-framebuffer/unbind
-bash: echo: write error: No such device
root@prox:~# echo 'efi-framebuffer.0' >/sys/bus/platform/drivers/efi-framebuffer/unbind
-bash: echo: write error: No such device
root@prox:~# echo 'simple-framebuffer.0' >/sys/bus/platform/drivers/simple-framebuffer/unbind

root@prox:~# tail /var/log/kern.log
Apr 29 17:01:20 prox kernel: [   93.977412] fwbr102i1: port 1(fwln102i1) entered blocking state
Apr 29 17:01:20 prox kernel: [   93.977414] fwbr102i1: port 1(fwln102i1) entered forwarding state
Apr 29 17:01:20 prox kernel: [   93.980796] fwbr102i1: port 2(tap102i1) entered blocking state
Apr 29 17:01:20 prox kernel: [   93.980799] fwbr102i1: port 2(tap102i1) entered disabled state
Apr 29 17:01:20 prox kernel: [   93.980912] fwbr102i1: port 2(tap102i1) entered blocking state
Apr 29 17:01:20 prox kernel: [   93.980914] fwbr102i1: port 2(tap102i1) entered forwarding state
Apr 29 17:01:26 prox kernel: [   99.286902] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
Apr 29 17:01:26 prox kernel: [   99.286929] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
Apr 29 17:01:26 prox kernel: [   99.288228] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:26 prox kernel: [   99.306735] vfio-pci 0000:03:00.1: enabling device (0100 -> 0102)
root@prox:~# tail /var/log/kern.log
Apr 29 17:01:31 prox kernel: [  105.009989] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:31 prox kernel: [  105.010003] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:31 prox kernel: [  105.010016] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:31 prox kernel: [  105.010028] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:31 prox kernel: [  105.010044] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:31 prox kernel: [  105.010060] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:31 prox kernel: [  105.010075] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:31 prox kernel: [  105.010093] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:31 prox kernel: [  105.010105] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
Apr 29 17:01:31 prox kernel: [  105.010120] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0x33fe0000000-0x33fefffffff 64bit pref]
root@prox:~#

Obviously there is a difference in the simplefb either existing or not existing, but I am not smart enough (yet) to really understand what this means.

Again, 5.13 has been rock solid for me. Once I got the passthrough working it has been perfect, except for when I installed my LSI card which changed the PCIe address. I also realized I can remove the pcie_acs_override=downstream,multifunction line from my GRUB and have it working on the 5.13 kernel; so while I haven't accomplished my main goal I still feel productive.


EDIT - I noticed that if I unplug my HDMI dummy plug and boot the device I get no BAR errors, but no output on my virtual screen. I can plug the HDMI dummy plug back in, and get a virtual screen output and no BAR errors. But if I reboot my ProxBox with the dummy plug in, I get the BAR errors.

Now I am doing research on how to get a virtual screen output without the dummy plug.
 
I'm having a similar experience with an NVIDIA GPU (1070 Ti) on the latest kernel (5.15). I've tried the workarounds described above to no avail.
The only thing I've found that works so far is not using it for boot, i.e. using another GPU (in my case an amdgpu card) for boot and the host. Passthrough of the NVIDIA card seems to work fine then.

I suspect the problem started with the switch to simplefb from efifb, more info from this post: https://forum.proxmox.com/threads/o...r-proxmox-ve-7-x-available.100936/post-449787

I think you may be able to compile the pve-kernel with efifb by removing the CONFIG_SYSFB_SIMPLEFB=y option. I have not gotten around to this yet, so I'm not sure, though it might be something to look into.
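Before rebuilding anything, it may be worth checking how the installed kernel was configured. This is a minimal sketch, assuming the kernel package ships its configuration as /boot/config-<version>, as Debian-based kernels usually do; kconfig_value is a made-up helper name.

```shell
#!/bin/sh
# Print the value of a kernel config option from a config file, or "unset".
# kconfig_value is a hypothetical helper name.
# Usage: kconfig_value <config-file> <OPTION>
kconfig_value() {
    # Options that are switched off appear as "# OPTION is not set",
    # so only lines of the form OPTION=value match here.
    value=$(sed -n "s/^$2=//p" "$1" | head -n 1)
    printf '%s\n' "${value:-unset}"
}

# Example on the host:
#   kconfig_value "/boot/config-$(uname -r)" CONFIG_SYSFB_SIMPLEFB
```

Comparing the result between the 5.13 and 5.15 config files shows whether this option is one of the things that changed between them.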
 
Yeah, I believe I've seen pretty much every post on this particular issue. I believe that we are the three posters who have talked the most about the process of getting GPUs passed through on the 5.15, with varying degrees of success.

I wish I had a better understanding of the concepts and issues here, but I'm still diving in.

Right now it appears that having any video outputs connected to the GPU being passed through will cause the framebuffer to be locked by the bootloader. By removing that connection, and plugging it in after the VM has loaded, everything works perfectly. It would appear that you are correct, and the options to disable the framebuffer are not working as intended right now.

With Proxmox 7.2 and the 5.15 kernel presumably getting closer to release, I hope there are more opportunities for me to improve my understanding here.
 
I finally figured out how to get this working. All I had to do was the following:
1. Create a job that runs the following script at boot

Code:
#!/bin/sh

# Remove the GPU from the PCI bus, then rescan the bus.
echo 1 > /sys/bus/pci/devices/0000\:0x\:00.0/remove
echo 1 > /sys/bus/pci/rescan

# Detach the virtual consoles.
echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind

# Unbind the boot framebuffer; drivers that are not bound report "No such device".
echo vesa-framebuffer.0 > /sys/bus/platform/drivers/vesa-framebuffer/unbind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
echo simple-framebuffer.0 > /sys/bus/platform/drivers/simple-framebuffer/unbind

Please replace the PCI address (0000:0x:00.0) with the one for your video card

2. Make sure that GRUB_CMDLINE_LINUX_DEFAULT has video=efifb:off as part of the options

3. Run update-grub (I sometimes forget this step, so I'm mentioning it just in case)
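For step 1, the address to fill in can be read from lspci -nn | grep -i vga, which prints the short form such as 03:00.0 (the card discussed earlier in this thread sits at 0000:03:00.0). Here is a small sketch that turns either form into the sysfs path used by the script; pci_remove_path is a made-up helper name, and defaulting the PCI domain to 0000 is an assumption that holds on most single-domain systems.

```shell
#!/bin/sh
# Build the sysfs "remove" path for a PCI device.
# pci_remove_path is a hypothetical helper name.
# Accepts "03:00.0" or "0000:03:00.0"; the domain defaults to 0000.
pci_remove_path() {
    addr=$1
    case $addr in
        ????:??:??.?) ;;                 # already includes the domain
        ??:??.?) addr=0000:$addr ;;      # short form as printed by lspci
        *) echo "unrecognised PCI address: $1" >&2; return 1 ;;
    esac
    printf '/sys/bus/pci/devices/%s/remove\n' "$addr"
}

# Example:
#   echo 1 > "$(pci_remove_path 03:00.0)"
#   echo 1 > /sys/bus/pci/rescan
```

Quoting the path this way also avoids the backslash escapes around the colons.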

Hopefully this should work for you. If you like, here is where I found the solution: https://forum.proxmox.com/threads/problem-with-gpu-passthrough.55918/

Happy passthrough!
 
