GPU passthrough to an Ubuntu Gaming Server

AnotherPVENoob

New Member
Jan 20, 2023
Hi everyone,

I've done everything in my power to solve this using other forum posts / reddit data / troubleshooting, but I feel like I've tried everything to no avail.

Work is introducing VDIs, so I figured I'd learn a bit about them and convert my gaming PC to a server: host Proxmox, run a Pi-hole, and do some learning. I also saw a CraftComputing video that suggested I could even pass through my GPU and continue to game using Moonlight or Steam Remote Play.

Sounds great!

I've followed the steps outlined by CraftComputing to no avail, so I figured I'd follow the Reddit guide (https://www.reddit.com/r/homelab/comments/b5xpua/the_ultimate_beginners_guide_to_gpu_passthrough), also to no avail; by now I've been through another three PVE forum threads, two more Reddit posts, three GitHub issues and a couple of NVIDIA forums.

I have a Z270I motherboard and am trying to pass a GTX 1070 graphics card through to an Ubuntu 21.10 VM. Here's what I've done (the host-side commands are sketched below the list):

  1. Turned on VT-d, added the intel_iommu line in GRUB and updated GRUB
  2. Added the VFIO modules
  3. Blacklisted the other drivers from PVE directly
  4. Identified my GPU and added it to vfio-pci per "options vfio-pci ids=10de:1b81,10de:10f0 disable_vga=1"
  5. Updated the initramfs
  6. Added the various flags to the machine's config file [screenshot]
  7. Added the PCI device to the machine
  8. Started and set up Ubuntu 21.10
  9. Confirmed the 1070 appears in the machine
  10. Set up Remmina
  11. Disabled the Display of the machine
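
For reference, steps 1-5 look roughly like the following host-side commands and config entries. This is only a sketch: the GRUB lines assume a GRUB-booted host, and the IDs are the ones from step 4.

    # /etc/default/grub (GRUB hosts) -- add to GRUB_CMDLINE_LINUX_DEFAULT:
    #   intel_iommu=on
    update-grub

    # /etc/modules -- load the VFIO modules at boot:
    vfio
    vfio_iommu_type1
    vfio_pci
    vfio_virqfd

    # /etc/modprobe.d/blacklist.conf -- keep host drivers off the GPU:
    blacklist nouveau
    blacklist nvidia

    # /etc/modprobe.d/vfio.conf -- bind the GTX 1070 (IDs from lspci -nn) to vfio-pci:
    options vfio-pci ids=10de:1b81,10de:10f0 disable_vga=1

    # Apply and reboot:
    update-initramfs -u -k all
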
I then tried connecting to the machine via Remmina and could not access it; it simply says it cannot connect to the RDP server at 10.0.0.6.

So, per various bits of advice (see the config sketch after this list), I've then tried:
  • Turning 'All Functions' and 'Primary GPU' on and off, both individually and together
  • Buying a DP dummy plug for the back of the machine
  • Adding a ROM file to the machine
  • Running nvidia-smi, which gives either the error 'No devices were found' or 'NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that [it] is installed and running'
  • Running nvidia-persistenced, which says it failed to initialize and to check the syslog
  • Checking various logs, which have sent me down rabbit holes of 'RmInitAdapter failed' and creating a startup script per some other forum: [screenshots]

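For context, the 'All Functions', 'Primary GPU', ROM file and rombar toggles above map onto the hostpci line of the VM config roughly like this (a sketch with illustrative values, not my actual file; the ROM filename is made up and would live in /usr/share/kvm/):

    # /etc/pve/qemu-server/<vmid>.conf
    machine: q35
    bios: ovmf
    # 01:00 with no .0 = All Functions; x-vga=1 = Primary GPU;
    # rombar=0 (hide the ROM) and romfile=... (supply a custom ROM) are the
    # ROM-related toggles; normally you'd use one or the other:
    hostpci0: 01:00,pcie=1,x-vga=1,romfile=gtx1070.rom
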
I understand this is a bird's nest of various fixes and attempts, and I'm happy to start again and recreate the machine if need be (it won't be the first, second, third or fourth time), but if anyone has any other suggestions as to what I might be doing wrong, it will be appreciated.

Happy to provide any logs if necessary, but bear in mind I can only get stuff out of the machine when I boot it with a screen output.

Any and all help appreciated!!
 
  1. Turned on VT-d, added the intel_iommu line in GRUB and updated GRUB
Are you sure your Proxmox uses GRUB? What is the output of cat /proc/cmdline on the Proxmox host?
I then tried connecting to the machine via Remmina and could not access it; it simply says it cannot connect to the RDP server at 10.0.0.6.
Is the VM running with 1 virtual CPU at 100% and doing nothing? Setting up a virtual serial console (which requires setup inside the VM as well) can help a lot when troubleshooting a Linux VM.
What are the syslog or journalctl messages when starting the VM? Those might be a clue to what is going wrong.
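Roughly, the serial console setup looks like this (100 stands in for your VMID):

    # On the Proxmox host: give the VM a serial port
    qm set 100 -serial0 socket

    # Inside the VM: put a console on ttyS0, e.g. by adding to the kernel
    # command line (GRUB_CMDLINE_LINUX in /etc/default/grub, then update-grub):
    #   console=ttyS0,115200

    # Back on the host, attach to it:
    qm terminal 100
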
So I've then tried, per various bits of advice:
Most of those steps don't do anything unless you can already see the GPU inside the VM with the lspci command (inside the Linux VM).
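For example, inside the VM:

    # Show the GPU and which kernel driver (if any) is currently bound to it:
    lspci -nnk | grep -iA3 'vga\|nvidia'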

EDIT: It is possible that adding the passthrough device changes the name of the network device inside the VM, which might explain why you cannot connect to it.
 
Are you sure your Proxmox uses GRUB? What is the output of cat /proc/cmdline on the Proxmox host?

Is the VM running with 1 virtual CPU at 100% and doing nothing? Setting up a virtual serial console (which requires setup inside the VM as well) can help a lot when troubleshooting a Linux VM.
What are the syslog or journalctl messages when starting the VM? Those might be a clue to what is going wrong.

Most of those steps don't do anything unless you can already see the GPU inside the VM with the lspci command (inside the Linux VM).

EDIT: It is possible that adding the passthrough device changes the name of the network device inside the VM, which might explain why you cannot connect to it.
  1. Output of cat /proc/cmdline: [screenshot] Apologies - I don't know enough to comment on whether Proxmox uses GRUB; I don't even know of any alternatives to GRUB (I followed the tutorial instructions, which seemed to suggest it)
  2. After two minutes of boot with the display output set to None, CPU and RAM usage were both at about 20%. [screenshot]
  3. Output of syslog | grep nvidia, as well as syslog.1 | grep nvidia (I assume the .1 simply logs the previous boot?): [screenshot] [screenshot]
  4. lspci does actually list the graphics card under 01:00.0. [screenshot] I thought there was a good chance you were correct about the network being different, though it still shows as available on 10.0.0.6 on my router? [screenshot]
I'm working on the serial port, but that might take a little longer.
 
  1. Output of cat /proc/cmdline: [screenshot] Apologies - I don't know enough to comment on whether Proxmox uses GRUB; I don't even know of any alternatives to GRUB (I followed the tutorial instructions, which seemed to suggest it)
The Proxmox manual describes how to check if your system uses systemd-boot or GRUB. In my opinion, the official manual for PCI(e) passthrough trumps any tutorial.
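A quick way to check, per the manual's approach (a sketch):

    # Reports, per ESP, whether GRUB or systemd-boot is in use:
    proxmox-boot-tool status

    # Kernel parameters then live in:
    #   GRUB:          /etc/default/grub      (apply with: update-grub)
    #   systemd-boot:  /etc/kernel/cmdline    (apply with: proxmox-boot-tool refresh)
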
  2. After two minutes of boot with the display output set to None, CPU and RAM usage were both at about 20%.
Sounds like it is stuck during boot with 100% on 1 virtual CPU (out of four).
  3. Output of syslog | grep nvidia, as well as syslog.1 | grep nvidia (I assume the .1 simply logs the previous boot?)
Either use the Syslog view in the Proxmox GUI or journalctl on the command line on the Proxmox host. Grepping for nvidia in /var/log/syslog inside the VM is not helpful.
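For example, on the host:

    # Follow host messages live while starting the VM:
    journalctl -f

    # Or search the current boot for passthrough-related lines afterwards:
    journalctl -b | grep -Ei 'vfio|nvidia|01:00'
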
  4. lspci does actually list the graphics card under 01:00.0. [screenshot]
How are you connecting to the VM? I thought you could not connect to it with passthrough? It looks like passthrough (from the Proxmox side) is working fine.
  I thought there was a good chance you were correct about the network being different, though it still shows as available on 10.0.0.6 on my router?
That does not prove anything. Check the network connection from inside the VM (to which you appear to be able to log in).
I'm working on the serial port, but that might take a little longer.
You might not need it. I was under the impression that you could not log in to the VM with passthrough, but it appears that that's not the problem.
 
Please let me double check: You can start the VM with passthrough and it mostly works? Except that you cannot connect to it via the network? And the GPU does not appear to work (but is present in the VM)?

Is your GPU the only GPU, or the GPU used during boot of the Proxmox host? Then you probably need this simple work-around: add initcall_blacklist=sysfb_init to the kernel parameters on the Proxmox host. No need to reset the GPU via scripts, but do bind it early to vfio-pci.

For the network: what is the output of ip a and ip r inside the VM and how did you configure the network inside the VM?
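For example, inside the VM (interface names are illustrative; Ubuntu VMs on Proxmox typically get something like ens18):

    # Addresses and routes:
    ip a
    ip r

    # If the NIC was renamed when the PCI device was added (e.g. ens18 became
    # ens19), update the interface name in the netplan config and re-apply:
    #   /etc/netplan/*.yaml
    netplan apply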
 
A few different things for me to reply to, but I want to start with the fundamental question I think you're asking:

You can start the VM with passthrough and it mostly works? Except that you cannot connect to it via the network? And the GPU does not appear to work (but is present in the VM)?

Is your GPU the only GPU, or the GPU used during boot of the Proxmox host? Then you probably need this simple work-around: add initcall_blacklist=sysfb_init to the kernel parameters on the Proxmox host. No need to reset the GPU via scripts, but do bind it early to vfio-pci.

The GPU does appear to be present in the VM, but I cannot get any functionality out of it at all; which I guess raises the question: is it possible this is not a configuration issue on PVE's side, but instead something that needs to be fixed in the VM with Ubuntu and NVIDIA? If so, I've definitely started this in the wrong forum and apologise.

To answer the rest of the question: I can connect to the network and reach the VM from my main PC using Remmina only if I boot with Display set to VirtIO-GPU; if I boot with Display: None, I cannot seem to connect to it at all (though my router does say it's still there).
[screenshot]
The network was simply VirtIO with the vmbr0 bridge - standard settings, I believe. Regardless, ip a and ip r as requested:

[screenshot]

The Proxmox manual describes how to check if your system uses systemd-boot or GRUB. In my opinion, the official manual for PCI(e) passthrough trumps any tutorial.
Agreed, and you are correct; I'm assuming it boots with systemd-boot. [screenshot]

Either use the Syslog view in the Proxmox GUI or journalctl on the command line on the Proxmox host. Grepping for nvidia in /var/log/syslog inside the VM is not helpful.
OH - on the HOST - right, hadn't thought of that.

I'm noticing a message regarding 01:00.0 (the graphics card) saying it cannot reserve memory. Is this likely to be anything? Is it talking about RAM or video memory?

[screenshot]

Thanks for your patience with this - trying to give as much info as I can.
 
I'm noticing a message regarding 01:00.0 (the graphics card) saying it cannot reserve memory. Is this likely to be anything? Is it talking about RAM or video memory?
That is the clue to your issue. It's the error message from the thread I pointed you to (the orange link in the text), and that thread provides the work-around.
Is your GPU the only GPU, or the GPU used during boot of the Proxmox host? Then you probably need this simple work-around: add initcall_blacklist=sysfb_init to the kernel parameters on the Proxmox host. No need to reset the GPU via scripts, but do bind it early to vfio-pci.
Check with cat /proc/cmdline after adding initcall_blacklist=sysfb_init to confirm it is applied. Note that you won't have a console or see boot messages on your Proxmox host. Also, do enable Primary GPU for NVIDIA GPUs (the Display setting is then automatically ignored, as if it were None).
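On a systemd-boot host the edit is a sketch like this (the root= part is whatever is already in your /etc/kernel/cmdline; leave it alone):

    # /etc/kernel/cmdline is a single line; append the new parameter, e.g.:
    #   root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on initcall_blacklist=sysfb_init

    # Write the change to the boot partition(s) and reboot:
    proxmox-boot-tool refresh
    reboot

    # Afterwards, verify it took effect:
    cat /proc/cmdline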
 
You are a legend; the orange link spoke of making a change to GRUB, but I updated /etc/kernel/cmdline instead, given your earlier link to the official documents (thank you).

I now have a different error message:

cat /proc/cmdline gives the desired outcome, I believe:

[screenshot]

... but I now get:

[screenshot]

... when booting my machine.

I tried removing the custom ROM file, which didn't fix the issue, and tried both rombar=1 and rombar=0, which unfortunately also didn't help.

I've seen some other threads suggesting this could need a BIOS update or a change to a BIOS setting (https://forum.proxmox.com/threads/pve7-vfio-pci-xxxx-xx-xx-x-no-more-image-in-the-pci-rom.108189/ and https://forum.level1techs.com/t/no-more-image-in-the-pci-rom/162244), but:

A) Do you read the situation any differently / have another solution?
B) Is there any way to access the host machine's BIOS without running my monitor and keyboard down to the actual machine?

Once again - thanks for your help; I'm learning this slowly.
 
I've seen some other threads suggesting this could need a BIOS update or a change to a BIOS setting (https://forum.proxmox.com/threads/pve7-vfio-pci-xxxx-xx-xx-x-no-more-image-in-the-pci-rom.108189/ and https://forum.level1techs.com/t/no-more-image-in-the-pci-rom/162244), but:

A) Do you read the situation any differently / have another solution?
I have no experience with NVIDIA GPUs (because they used to break passthrough on purpose). There are threads about ROM patching for NVIDIA GPUs; maybe you need something like that.
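A commonly cited sketch for dumping the card's vBIOS from the host, should you need a ROM file to patch or pass in (run it while nothing on the host is using the GPU; the output filename is just an example):

    # Enable reads of the ROM BAR, copy the ROM out, then disable reads again:
    echo 1 > /sys/bus/pci/devices/0000:01:00.0/rom
    cat /sys/bus/pci/devices/0000:01:00.0/rom > /usr/share/kvm/gtx1070.rom
    echo 0 > /sys/bus/pci/devices/0000:01:00.0/rom

    # Then reference it from the VM's hostpci line, e.g.:
    #   hostpci0: 01:00,pcie=1,x-vga=1,romfile=gtx1070.rom
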
B) Is there any way to access the host machine's BIOS without running my monitor and keyboard down to the actual machine?
Server motherboards have IPMI. Or maybe you have a PiKVM setup? Since passthrough can easily break your Proxmox host (and updates can also break it sometimes), I suggest keeping a keyboard and monitor attached. Also keep a very simple GPU installed for Proxmox, which also avoids the 'single GPU passthrough' problems.
 
