Passthrough error migrating from Proxmox 8 to 9

hal9008

Member
Sep 23, 2020
I’ve run into a serious problem migrating from Proxmox 8 to Proxmox 9.

My machine specs:
  • Case: Antec P101 Silent
  • PSU: Seasonic B12 BM-650 - A651BMAFH - 80 Plus Bronze
  • Motherboard: Gigabyte GA-Z97X-UD3H (Proxmox reports it as Z97X-UD3H-CF). Released Q2 2014
  • CPU: Intel(R) Xeon(R) CPU E3-1245 v3 @ 3.40GHz – 4 Cores, 8 Threads – Released Q2’13
  • RAM:
    • 2x Kingston 99U5471-066.A00LF 8GiB DIMM DDR3 Synchronous 1600 MHz (0.6 ns) in slots 1 and 3
    • 2x KVR16N11/8 8GB PC RAM Kingston PC3-12800U DDR3-1600MHz in slots 2 and 4
    • Total: 32 GB
  • Dedicated GPU: Nvidia Quadro P2000 5 GB RAM – Passthrough to VM 101
  • Integrated GPU: Xeon E3-1200 v3 Processor Integrated Graphics Controller – Passthrough to VM 103
  • Onboard NIC: Intel Corporation Ethernet Connection I217-V
  • CPU Audio Controller (HDMI out): Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller
  • Motherboard Audio Controller (case jacks): Intel 9 Series Chipset Family HD Audio Controller
  • Disk /dev/sda (1 TB, WDC WDS100T1R0A): Proxmox + multiple Linux OS (96% wearout as of 2025-04-01, 4% used)
  • Disk /dev/sdb (1 TB, WDC WDS100T1R0A): Virtualmin + Windows Server (92% wearout as of 2025-04-01, 8% used)
  • Disk /dev/sdc (7.45 TB, WDC WD8003FFBX-6): various – passthrough to VM 101
  • Disk /dev/sdd (3 TB, WDC WD30EZRX-00M): Proxmox backups + Apple Time Machine
I have three active VMs (101, 102, 103). VM 103 has passthrough of the integrated GPU and works fine.

The problem is with VM 101. I’m trying to pass through a Quadro P2000, but it’s not working at all. I’ve tried everything. In short:
  • Switched the VM from BIOS to EFI
  • Updated Nvidia drivers inside the VM from 535 to 550
My /etc/default/grub on Proxmox looks like this:

Code:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off intel_iommu=on iommu=pt initcall_blacklist=sysfb_init video=simplefb:off video=vesafb:off video=efifb:off video=vesa:off disable_vga=1 vfio_iommu_type1.allow_unsafe_interrupts=1 kvm.ignore_msrs=1 modprobe.blacklist=nouveau,nvidia,nvidiafb,nvidia-gpu,vesafb,efifb pcie_acs_override=downstream,multifunction pci=noaer"
GRUB_CMDLINE_LINUX=""
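
For reference, edits to /etc/default/grub only take effect once the bootloader config and initramfs are regenerated and the host is rebooted. A minimal sketch of the usual steps on a GRUB-booted host (the grep is just a sanity check; the exact messages vary by kernel):

Code:
update-grub                   # regenerate /boot/grub/grub.cfg from /etc/default/grub
update-initramfs -u -k all    # also picks up changes under /etc/modprobe.d
reboot
# after the reboot, verify the parameters were applied and the IOMMU is active:
cat /proc/cmdline
dmesg | grep -e DMAR -e IOMMU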

And my VM 101 config (101.conf):
Code:
agent: 1
args: -cpu host,kvm=off
bios: ovmf
boot: order=scsi3;scsi0
cores: 3
cpu: host,flags=+aes,hidden=1,hv_vendor_id=proxmox
#cpu: host
efidisk0: discoprincipal:101/vm-101-disk-1.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 0000:01:00.0;0000:01:00.1,pcie=1
machine: pc-q35-9.2
memory: 16384
name: Pi-Hole
net0: virtio=9E:3E:E3:24:90:84,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: discoprincipal:101/vm-101-disk-0.qcow2,discard=on,size=256G
scsi1: TimeMachine:101/vm-101-disk-0.qcow2,backup=0,discard=on,size=1430G
scsi2: /dev/disk/by-id/ata-WDC_WD8003FFBX-68B9AN0_VYHGRLXM,backup=0,discard=on,size=7814026584K
scsi3: discoprincipal:101/vm-101-disk-2.qcow2,discard=on,size=536871K,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=482d392e-32cd-49e5-ab8a-69f3541c194e
sockets: 1
tablet: 1
vmgenid: a1be1ed5-c701-4ddd-a383-cd0d72bcd560
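
For completeness, a quick way to check how the host groups these devices (the Quadro at 01:00.0 and its audio function at 01:00.1 should ideally sit in a group of their own); this is plain shell, nothing Proxmox-specific:

Code:
#!/bin/bash
# print every PCI device together with its IOMMU group (run on the host)
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    group=$(basename "$(dirname "$(dirname "$dev")")")
    printf 'IOMMU group %s: ' "$group"
    lspci -nns "$(basename "$dev")"
done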

I’ve tried every possible way to hide the GPU from Nvidia’s virtualization detection (even dumping and passing the GPU’s ROM), but nothing works. Inside the VM, I always end up with this:

Code:
marcosms@pi-hole:~/descargas/nvidia-patch-master$ nvidia-smi
No devices were found
marcosms@pi-hole:~/descargas/nvidia-patch-master$ sudo dmesg | grep -i nvidia
[    3.743446] nvidia: loading out-of-tree module taints kernel.
[    3.743455] nvidia: module license 'NVIDIA' taints kernel.
[    3.743459] nvidia: module license taints kernel.
[    3.870654] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[    3.872567] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[    4.112386] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  550.163.01  Tue Apr  8 12:41:17 UTC 2025
[    4.149777] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  550.163.01  Tue Apr  8 12:09:34 UTC 2025
[    4.184175] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[   20.611415] [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[   20.615700] [drm:nv_drm_register_drm_device [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device
[   21.226304] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[   21.323272] nvidia-uvm: Loaded the UVM driver, major device number 235.
marcosms@pi-hole:~/descargas/nvidia-patch-master$ lspci -nnk | grep -A 3 -i nvidia
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106GL [Quadro P2000] [10de:1c30] (rev a1)
        Subsystem: NVIDIA Corporation GP106GL [Quadro P2000] [10de:11b3]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
        Subsystem: NVIDIA Corporation GP106 High Definition Audio Controller [10de:11b3]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
05:01.0 PCI bridge [0604]: Red Hat, Inc. QEMU PCI-PCI bridge [1b36:0001]

I can’t get rid of these two errors:

Code:
[drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[drm:nv_drm_register_drm_device [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device

These errors prevent the Nvidia driver from loading correctly.
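
One host-side detail worth double-checking after an upgrade is that the card is claimed by vfio-pci before any other driver can touch it. A minimal sketch using the device IDs from the lspci output above (10de:1c30 for the GPU, 10de:10f1 for its audio function); the file name is just an example:

Code:
# /etc/modprobe.d/vfio.conf (example file name)
options vfio-pci ids=10de:1c30,10de:10f1 disable_vga=1
softdep nouveau pre: vfio-pci
softdep nvidia pre: vfio-pci

# then rebuild the initramfs and reboot:
#   update-initramfs -u -k all && reboot
# afterwards the host should report "Kernel driver in use: vfio-pci":
#   lspci -nnk -s 01:00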


This didn’t happen on Proxmox 8. Is there any workaround for this? I refuse to believe I’m the only one facing this issue. Can someone open a ticket or escalate this so it gets fixed in Proxmox 9? It’s very frustrating that something works in the previous version but breaks in the new one.
 
I don’t understand what’s going on with passthrough functionality. Can someone explain it to me?

Is this now a premium feature or something like that? Does Nvidia have premium drivers? Why is this happening? I’ve seen several threads both here and elsewhere, and it looks like people are just shooting in the dark without knowing how to solve this issue.

Does what’s happening make any sense? Are the developers aware of these problems and working on solutions, or has this become a paid feature now?

Hopefully someone can clarify this.
 
Hi,

no, it's not a paid feature. Proxmox VE is 100% open source, and there is no feature gating.

As for why PCI pass-through does not work correctly for you, I can't say exactly with the information at hand, but it's often a bit of trial and error (mostly because of hardware), especially with older hardware (you're using a consumer-grade mainboard, a CPU from 12 years ago, and a GPU from 7 years ago).

In your case, I'd try to remove all customizations (kernel command line, the args line from the config, custom patches/packages/etc.) and start fresh, then post the VM config, the full 'dmesg' output of the host and guest, and the 'lspci -nnk' output of the host and guest.
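
A quick way to collect exactly that into attachable files (file names are just examples; run the first three commands inside the guest as well):

Code:
dmesg > host-dmesg.txt
journalctl -b > host-journal.txt
lspci -nnk > host-lspci.txt
qm config 101 > vm101-config.txt      # the VM config as Proxmox sees it
tar czf pve-debug-logs.tar.gz host-*.txt vm101-config.txt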
 
First of all, thanks to @dcsapak for taking an interest in this issue. It reassures me to know that it’s not a licensing problem (I was convinced it had to be something along those lines).

These past few days I’ve been trying to solve this matter and came to a few conclusions:
  • Passthrough of both cards on my system is impossible. I tried countless modifications in the grub file and the files inside modprobe.d, but I couldn’t get passthrough working for the NVIDIA GPU. I tested different driver versions (535, 550, 570, 580), but none of them worked.
  • Since the GPU is only used by two specific dockers, I tried creating an LXC and running them there (Yes… I know LXC isn’t ideal for running docker, but it was worth a try). I installed the GPU drivers both on the host (Proxmox) and inside the LXC (obviously removing all the parameters from GRUB_CMDLINE_LINUX_DEFAULT first). That way, I managed to get the driver working inside the LXC and the card was recognized.
When running the Docker containers (Jellyfin and Ollama+OpenWebUI), they detected the GPU but managed it poorly: the memory got saturated before the containers were actually used and never freed up at any point (something that worked perfectly in Proxmox 8). On top of that, I couldn’t keep Intel GPU passthrough working while leaving the NVIDIA card as “no-passthrough” (if one worked, the other wouldn’t at all). Because of this, I chose the easy route: reinstall Proxmox 8 and restore backups of all machines. Now I finally have everything running again.
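
For anyone trying the same LXC route: with the NVIDIA driver installed on the host, the container usually only needs the device nodes mapped in. A minimal sketch for a privileged container (the container ID and the cgroup major numbers are examples; check the real ones with ls -l /dev/nvidia* on the host):

Code:
# /etc/pve/lxc/200.conf  (200 is an example container ID)
lxc.cgroup2.devices.allow: c 195:* rwm   # /dev/nvidia0, /dev/nvidiactl, ...
lxc.cgroup2.devices.allow: c 235:* rwm   # /dev/nvidia-uvm (major number varies per host)
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file

Inside the container, the user-space driver has to match the host’s driver version (e.g. installed from the .run file with --no-kernel-module) for nvidia-smi to work.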

Still, I’m left with the frustration of not having been able to solve this issue. I understand the hardware I’m using is a bit dated, but believe me, for many tasks the system I’m running is still oversized (I’ve seen much newer setups running similar services at a fraction of the speed of this little server). Right now, with 3 virtual machines running (one Windows Server, another with Virtualmin hosting websites and email, and another with 15 dockers and multiple services of different kinds), I’m at 6% CPU usage and 70% RAM usage. With those conditions, changing hardware is simply not an option.

The fact that with the same EFI configuration and same hardware, passthrough doesn’t work at all in Proxmox 9 while in Proxmox 8 it runs smoothly can only point to “something-I-don’t-know” in version 9 that isn’t right. Especially since I see more people having passthrough issues in this version. My conclusion is that the hardware isn’t defective, nor is it “too old” for passthrough to be possible. I believe the problem comes from how the new software stack in Proxmox 9 (kernel + QEMU + drivers) interprets or handles that hardware, and that’s something the Proxmox developers should review.
 
I just want to point out that I'd like to help, but as I already said, with the information at hand I cannot.

You did not provide any of the information I asked for (journalctl, dmesg, lspci, etc.) from a clean state, so it's impossible to say why it works on PVE 8 but not on PVE 9.

It might be that there are some regressions in the kernel/QEMU that are not reproducible on newer hardware, since older hardware is not tested that well anymore by kernel/QEMU devs (that's why I mentioned the age).

If you want to solve this for PVE 9, please provide the output from the state I asked for (or at least test with e.g. the 6.14 kernel on PVE 8 and provide the output), otherwise it's impossible for me to help.
 
OK, sorry for not providing this in my previous message. I’m not entirely sure if what I’m attaching here is exactly what you need, but I hope it helps.

Right now the server is running Proxmox 8 with everything working perfectly (including passthrough). In the attached compressed file you’ll find:
  • The output of lspci, dmesg, and journalctl from both the Proxmox host and the VM that uses the NVIDIA card.
  • System events, device list, and general event logs from the Windows VM (which has the Intel GPU passed through).
It’s possible that in the VM event logs you’ll see plenty of errors between September 11 and 15 (those are the days when I tried the migration).

https://descargas.flopy.es/?r=/download&path=L1BhcmEgb3Ryb3MvRm9ybyBQcm94bW94L2xvZ3Muemlw

If you need any additional data, please just let me know.

If we can identify the root cause and have a clear path to a solution, I’d be very interested in migrating to Proxmox 9 (I prefer not to stay on older software versions if I can move to the newer ones).
 
Well, the most interesting thing would be the host logs in a non-working state, otherwise I cannot even start to search for where the problem might be.

As I said, a start would be to try the 6.14 kernel on the host; if that fails, we have a candidate for where a regression might be.
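
For the record, the 6.14 kernel is available as an opt-in package on PVE 8, so this test does not require the full upgrade. A rough sketch (the pinned version string is only an example; take the real one from 'proxmox-boot-tool kernel list'):

Code:
apt update
apt install proxmox-kernel-6.14           # opt-in kernel series on PVE 8
proxmox-boot-tool kernel list             # note the exact 6.14.x version installed
proxmox-boot-tool kernel pin 6.14.8-2-pve # example version string, use the one listed above
reboot
uname -r                                  # confirm 6.14 is running before re-testing passthrough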
 
The best approach I can think of to provide you with the required logs is the following:
  • Proxmox is installed on a single SSD (this is not an enterprise setup, so I’m not using RAID). I could use a disk cloner to copy the whole system drive to another SSD of the same size (I believe I have a spare one at home).
  • Once cloned, I can install that SSD in the server and check if it boots correctly. If it does, I would then upgrade the cloned system to Proxmox 9.
  • After the upgrade, I can test passthrough of the GPUs again, and if it fails, I’ll be able to collect the logs you need.
  • Once finished, I can simply swap back the original drive to quickly restore everything to its working state.
What I need to confirm is exactly which logs you’d like me to gather. If it’s the same set you mentioned before, no problem—I’ll prepare them.

In any case, I probably won’t have time to do this until this weekend or the next one. As soon as I have the data, I’ll post it here.

Thank you very much for the help and support you’re giving me.
 
What I need to confirm is exactly which logs you’d like me to gather. If it’s the same set you mentioned before, no problem—I’ll prepare them.
I'd like at least the output of

Code:
journalctl -b
dmesg
lspci -nnk

and the start task log

as well as any relevant logs from the guest (also journalctl, dmesg, lspci)
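
As a pointer for the start task log: in the GUI it can be opened by double-clicking the VM's start task in the task pane at the bottom; on the CLI, something along these lines captures equivalent output (VM ID 101 as in the thread, file name is an example):

Code:
qm start 101 2>&1 | tee vm101-start.log   # start the VM manually and save any errors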

In any case, I probably won’t have time to do this until this weekend or the next one. As soon as I have the data, I’ll post it here.
No problem, just update the thread when you have the information.