Can't get Nvidia GPU passthrough to work on LXC

fabian69420

New Member
Jun 18, 2026
Hey there, I recently added an Nvidia RTX 3060 Ti GPU to my working Proxmox server and I want to pass it through to my Debian LXC container to use with Jellyfin for 4K films.

How I have it configured right now: Proxmox runs a Debian 13 LXC container (unprivileged), which runs a Docker container containing Jellyfin.


I tried to follow the guide over here: https://jellyfin.org/docs/general/post-install/transcoding/hardware-acceleration/nvidia
I tried the Debian one and the Virtual environment one, but when I run the commands this is what I get:

Code:
# modprobe nvidia

modprobe: FATAL: Module nvidia-current not found in directory /lib/modules/7.0.2-6-pve

modprobe: ERROR: Error running install command 'modprobe -i nvidia-current ' for module nvidia: retcode 1

modprobe: ERROR: could not insert 'nvidia': Invalid argument


# nvidia-smi

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

So it appears not to work, even though I installed the nvidia-driver package (I even went through the trouble of setting up the auto-detect thing just to be sure) and linux-headers-amd64. Oh, also: I get the errors above on the Proxmox host as well as in the Debian container.

I honestly don't know what else to add here, so if there's anything I need to add in order to receive help with this, please let me know.
 
hey, thanks for reaching out! i tried to use your guide, specifically the Nvidia-specific part of the GPU passthrough section. but still no worky. and i ran a lot of commands on the host as well as the container bc it wasnt clear to me where i should use them, so often i just tried both to see what sticks
 
Let's start with this from the node side
Bash:
lspci -vnnk | awk '/VGA/{print $0}' RS=
nvidia-smi
pct config CTIDHERE
 
this is what i get. also i used to think node meant CT/VM, so i re-checked but apparently the node is the host right? anyhow this is on the host:

Code:
# lspci -vnnk | awk '/VGA/{print $0}' RS=
nvidia-smi
pct config CTIDHERE
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti] [10de:2486] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited / Sapphire Technology Device [174b:a632]
        Flags: fast devsel, IRQ 255, IOMMU group 12
        Memory at f5000000 (32-bit, non-prefetchable) [size=16M]
        Memory at fa00000000 (64-bit, prefetchable) [size=8G]
        Memory at fc00000000 (64-bit, prefetchable) [size=32M]
        I/O ports at f000 [size=128]
        Expansion ROM at f6000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, IntMsgNum 0
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Capabilities: [bb0] Physical Resizable BAR
        Capabilities: [c1c] Physical Layer 16.0 GT/s <?>
        Capabilities: [d00] Lane Margining at the Receiver
        Capabilities: [e00] Data Link Feature <?>
        Kernel driver in use: vfio-pci
        Kernel modules: nvidia
07:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raphael [1002:164e] (rev c4) (prog-if 00 [VGA controller])
        Subsystem: ASUSTeK Computer Inc. Device [1043:8877]
        Flags: bus master, fast devsel, latency 0, IRQ 81, IOMMU group 15
        Memory at fc10000000 (64-bit, prefetchable) [size=256M]
        Memory at fc20000000 (64-bit, prefetchable) [size=2M]
        I/O ports at d000 [size=256]
        Memory at f6500000 (32-bit, non-prefetchable) [size=512K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [64] Express Legacy Endpoint, IntMsgNum 0
        Capabilities: [a0] MSI: Enable- Count=1/4 Maskable- 64bit+
        Capabilities: [c0] MSI-X: Enable+ Count=4 Masked-
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [270] Secondary PCI Express
        Capabilities: [2a0] Access Control Services
        Capabilities: [2b0] Address Translation Service (ATS)
        Capabilities: [2c0] Page Request Interface (PRI)
        Capabilities: [2d0] Process Address Space ID (PASID)
        Capabilities: [410] Physical Layer 16.0 GT/s <?>
        Capabilities: [450] Lane Margining at the Receiver
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

400 Parameter verification failed.
vmid: type check ('integer') failed - got 'CTIDHERE'
pct config <vmid> [OPTIONS]

it can find both the dedicated Nvidia GPU and the integrated AMD GPU. but the AMD one is irrelevant for this, as it's the Nvidia card i want to use
 
Kernel driver in use: vfio-pci
Seems like you configured it for PCI(e) passthrough or gave it to a VM. You need to reverse that for this to work. The driver should be nvidia.
CTIDHERE is a placeholder. Use your CT's ID there.
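To check just that one line without reading the whole lspci dump, a small helper like the one below could be used. It is only a sketch: the function name driver_in_use is made up for illustration, and it assumes the stanza layout that lspci -nnk prints (the slot at the start of an unindented line, details indented below it).

```shell
#!/bin/sh
# Print the "Kernel driver in use" value for one PCI slot, reading
# `lspci -nnk` style output on stdin. Prints "none" if no driver is bound.
# driver_in_use is a hypothetical helper name, not a Proxmox tool.
driver_in_use() {
    slot="$1"
    awk -v slot="$slot" '
        /^[0-9a-f]/ { current = ($1 == slot) }   # an unindented line starts a new device
        current && /Kernel driver in use:/ { print $NF; found = 1 }
        END { if (!found) print "none" }
    '
}

# Usage on the node (01:00.0 is the slot shown in the lspci output):
#   lspci -nnk | driver_in_use 01:00.0
```

For CT use the result should be nvidia; vfio-pci means the card is still reserved for VM passthrough, and none means no driver has claimed it.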
 
i removed the vfio-pci through
Code:
modprobe -r vfio-pci

then it became this:
Code:
01:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti] (rev a1)
        Subsystem: PC Partner Limited / Sapphire Technology Device a632
        Kernel modules: nvidia

but after a reboot it becomes:
Code:
01:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)
        Subsystem: PC Partner Limited / Sapphire Technology Device a632
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel


also:

Code:
# pct config 103
arch: amd64
cores: 6
features: nesting=1
hostname: media
memory: 4096
mp0: /mnt/usbstick,mp=/mnt/usbstick
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.2.254,hwaddr=BC:24:11:33:11:54,ip=192.168.2.207/24,type=veth
onboot: 1
ostype: debian
rootfs: media:103/vm-103-disk-0.raw,size=3500G
swap: 2048
unprivileged: 1
lxc.cgroup2.devices.allow: c 226:0 rwm
lxc.cgroup2.devices.allow: c 226:128 rwm
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.idmap: u 0 100000 65536
lxc.idmap: g 0 100000 44
lxc.idmap: g 44 44 1
lxc.idmap: g 45 100045 62
lxc.idmap: g 107 993 1
lxc.idmap: g 108 100108 65428
 
vfio-pci is not used unless it has been configured. Currently your system is set up for PCI(e) (VM) passthrough. You need to reverse those steps to use the card with CTs. Please share the output of this:
Bash:
find /etc/modprobe.d/* -exec tail -n+1 {} +
tail -n+1 /proc/cmdline /etc/kernel/cmdline /etc/default/grub
grep -sR "hostpci" /etc/pve
The modern way is via dev:, not lxc.mount.entry:. Also please don't use file based disks if you can avoid it.
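As a rough illustration of the dev: approach, the sketch below only prints the pct set commands so they can be reviewed before anything is changed. The helper name print_dev_entries and the list of device nodes are assumptions: which /dev/nvidia* nodes exist depends on the driver being loaded on the node, and the old lxc.cgroup2.devices.allow and lxc.mount.entry lines would still have to be removed from the CT config by hand.

```shell
#!/bin/sh
# Print (do not run) one `pct set` command per device node, filling the
# dev0, dev1, ... slots in order. print_dev_entries is a hypothetical helper.
print_dev_entries() {
    ctid="$1"; shift
    n=0
    for path in "$@"; do
        echo "pct set $ctid --dev$n $path"
        n=$((n + 1))
    done
}

# Example for CT 103. The node list is an assumption; confirm it first
# with `ls /dev/nvidia*` on the node once the driver is loaded.
print_dev_entries 103 /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm
```

Printing instead of executing keeps the change reviewable, which matters here because the CT config already carries hand-written lxc.* entries.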
 
yeah i did configure vfio-pci myself, following a guide that told me to do so to make the passthrough work, but clearly it doesn't work and i can't remember how i configured it :c

Bash:
# find /etc/modprobe.d/* -exec tail -n+1 {} +
tail -n+1 /proc/cmdline /etc/kernel/cmdline /etc/default/grub
grep -sR "hostpci" /etc/pve
==> /etc/modprobe.d/blacklist.conf <==
blacklist nouveau
blacklist nvidia
blacklist nvidiafb
blacklist nvidia_drm

==> /etc/modprobe.d/dkms.conf <==
# modprobe information used for DKMS modules
#
# This is a stub file, should be edited when needed,
# used by default by DKMS.

==> /etc/modprobe.d/intel-microcode-blacklist.conf <==
# The microcode module attempts to apply a microcode update when
# it autoloads.  This is not always safe, so we block it by default.
blacklist microcode

==> /etc/modprobe.d/nvidia-blacklists-nouveau.conf <==
# You need to run "update-initramfs -u" after editing this file.

# see #580894
blacklist nouveau

==> /etc/modprobe.d/nvidia.conf <==
install nvidia modprobe -i nvidia-current $CMDLINE_OPTS

install nvidia-modeset modprobe nvidia ; modprobe -i nvidia-current-modeset $CMDLINE_OPTS

install nvidia-drm modprobe nvidia-modeset ; modprobe -i nvidia-current-drm $CMDLINE_OPTS

install nvidia-uvm modprobe nvidia ; modprobe -i nvidia-current-uvm $CMDLINE_OPTS

install nvidia-peermem modprobe nvidia ; modprobe -i nvidia-current-peermem $CMDLINE_OPTS

# unloading needs the internal names (i.e. upstream's names, not our renamed files)

remove nvidia modprobe -r -i nvidia-drm nvidia-modeset nvidia-peermem nvidia-uvm nvidia

remove nvidia-modeset modprobe -r -i nvidia-drm nvidia-modeset


alias char-major-195* nvidia

# These aliases are defined in *all* nvidia modules.
# Duplicating them here sets higher precedence and ensures the selected
# module gets loaded instead of a random first match if more than one
# version is installed. See #798207.
alias   pci:v000010DEd00000E00sv*sd*bc04sc80i00*        nvidia
alias   pci:v000010DEd00000AA3sv*sd*bc0Bsc40i00*        nvidia
alias   pci:v000010DEd*sv*sd*bc03sc02i00*               nvidia
alias   pci:v000010DEd*sv*sd*bc03sc00i00*               nvidia

==> /etc/modprobe.d/nvidia-options.conf <==
#options nvidia-current NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=44 NVreg_DeviceFileMode=0660

# To grant performance counter access to unprivileged users, uncomment the following line:
#options nvidia-current NVreg_RestrictProfilingToAdminUsers=0

# Uncomment to enable this power management feature:
#options nvidia-current NVreg_PreserveVideoMemoryAllocations=1

# Uncomment to enable this power management feature:
#options nvidia-current NVreg_EnableS0ixPowerManagement=1

==> /etc/modprobe.d/pve-blacklist.conf <==
# This file contains a list of modules which are not supported by Proxmox VE

# nvidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb

==> /etc/modprobe.d/vfio.conf <==
options vfio-pci ids=10de:2486,10de:228b disable_vga=1

==> /etc/modprobe.d/zfs.conf <==
options zfs zfs_arc_max=819986432
==> /proc/cmdline <==
BOOT_IMAGE=/boot/vmlinuz-7.0.2-6-pve root=/dev/mapper/pve-root ro quiet amd_iommu=on iommu=pt

==> /etc/kernel/cmdline <==
amd_iommu=on iommu=pt

==> /etc/default/grub <==
# If you change this file or any /etc/default/grub.d/*.cfg file,
# run 'update-grub' afterwards to update /boot/grub/grub.cfg.
# For full documentation of the options in these files, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`( . /etc/os-release && echo ${NAME} )`
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
GRUB_CMDLINE_LINUX=""

# If your computer has multiple operating systems installed, then you
# probably want to run os-prober. However, if your computer is a host
# for guest OSes installed via LVM or raw disk devices, running
# os-prober can cause damage to those guest OSes as it mounts
# filesystems to look for things.
#GRUB_DISABLE_OS_PROBER=false

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE/GOP/UGA
# you can see them in real GRUB with the command `videoinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"
 
Not trying to undercut the assistance which users have provided here, but having gone through the process of manually passing GPUs to both VMs and CTs, I've started to take the easy path and use the ProxMenux scripts, which let you switch the GPU between VM and CT with a single menu option. The package also includes scripts for drives, share management and many of the other settings that are rarely changed on Proxmox.

Install the script on the host from here: https://github.com/MacRimi/ProxMenux
 
hey thanks for this script, it seems really helpful. but unfortunately it didn't work for me. when i try to add the GPU to the LXC it gives me this:

[attached screenshot: 1781981951726.png]

and when i go to install the drivers on the host through the menu option, it acts like it's done, but the error reappears anyway
 
I recommend against such scripts that do everything (and even more unrelated stuff) in the background. You just create a black box you can't troubleshoot.
I also can't help properly if your environment changes while I give support. Try this
Bash:
rm -f /etc/modprobe.d/blacklist.conf /etc/modprobe.d/vfio.conf
update-initramfs -ukall
Reboot, then share the output of the first two commands from #4 again.
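For anyone who wants to see which lines are responsible before deleting whole files, a check along these lines could help. It is a sketch under assumptions: find_blockers is a made-up name, and it only looks for the two patterns seen in this thread (a blacklist entry for the nvidia module or one of its nvidia_*/nvidia-* siblings, and a vfio-pci ids= reservation). The nvidiafb blacklist that Proxmox ships is deliberately not reported.

```shell
#!/bin/sh
# List modprobe.d lines that keep the nvidia driver off the card.
# find_blockers is a hypothetical helper; the directory argument is optional.
find_blockers() {
    dir="${1:-/etc/modprobe.d}"
    # Matches "blacklist nvidia", "blacklist nvidia_drm", "blacklist nvidia-uvm"
    # but not "blacklist nvidiafb"; also matches "options vfio-pci ... ids=".
    grep -HnE '^[[:space:]]*(blacklist[[:space:]]+nvidia([_-][a-z]+)?[[:space:]]*$|options[[:space:]]+vfio-pci[[:space:]]+.*ids=)' \
        "$dir"/*.conf 2>/dev/null
}

# Usage on the node:
#   find_blockers
```

Whatever is changed under /etc/modprobe.d only takes effect at boot after the initramfs has been rebuilt, which is what the update-initramfs call above is for.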
 
i did the thing, rebooted and this is the output:

Bash:
~# find /etc/modprobe.d/* -exec tail -n+1 {} +
tail -n+1 /proc/cmdline /etc/kernel/cmdline /etc/default/grub
==> /etc/modprobe.d/dkms.conf <==
# modprobe information used for DKMS modules
#
# This is a stub file, should be edited when needed,
# used by default by DKMS.

==> /etc/modprobe.d/intel-microcode-blacklist.conf <==
# The microcode module attempts to apply a microcode update when
# it autoloads.  This is not always safe, so we block it by default.
blacklist microcode

==> /etc/modprobe.d/nouveau-blacklist.conf <==
blacklist nouveau
options nouveau modeset=0

==> /etc/modprobe.d/nvidia-blacklists-nouveau.conf <==
# You need to run "update-initramfs -u" after editing this file.

# see #580894
blacklist nouveau

==> /etc/modprobe.d/nvidia.conf <==
install nvidia modprobe -i nvidia-current $CMDLINE_OPTS

install nvidia-modeset modprobe nvidia ; modprobe -i nvidia-current-modeset $CMDLINE_OPTS

install nvidia-drm modprobe nvidia-modeset ; modprobe -i nvidia-current-drm $CMDLINE_OPTS

install nvidia-uvm modprobe nvidia ; modprobe -i nvidia-current-uvm $CMDLINE_OPTS

install nvidia-peermem modprobe nvidia ; modprobe -i nvidia-current-peermem $CMDLINE_OPTS

# unloading needs the internal names (i.e. upstream's names, not our renamed files)

remove nvidia modprobe -r -i nvidia-drm nvidia-modeset nvidia-peermem nvidia-uvm nvidia

remove nvidia-modeset modprobe -r -i nvidia-drm nvidia-modeset


alias char-major-195* nvidia

# These aliases are defined in *all* nvidia modules.
# Duplicating them here sets higher precedence and ensures the selected
# module gets loaded instead of a random first match if more than one
# version is installed. See #798207.
alias   pci:v000010DEd00000E00sv*sd*bc04sc80i00*        nvidia
alias   pci:v000010DEd00000AA3sv*sd*bc0Bsc40i00*        nvidia
alias   pci:v000010DEd*sv*sd*bc03sc02i00*               nvidia
alias   pci:v000010DEd*sv*sd*bc03sc00i00*               nvidia

==> /etc/modprobe.d/nvidia-options.conf <==
#options nvidia-current NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=44 NVreg_DeviceFileMode=0660

# To grant performance counter access to unprivileged users, uncomment the following line:
#options nvidia-current NVreg_RestrictProfilingToAdminUsers=0

# Uncomment to enable this power management feature:
#options nvidia-current NVreg_PreserveVideoMemoryAllocations=1

# Uncomment to enable this power management feature:
#options nvidia-current NVreg_EnableS0ixPowerManagement=1

==> /etc/modprobe.d/pve-blacklist.conf <==
# This file contains a list of modules which are not supported by Proxmox VE

# nvidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb

==> /etc/modprobe.d/zfs.conf <==
options zfs zfs_arc_max=819986432
==> /proc/cmdline <==
BOOT_IMAGE=/boot/vmlinuz-7.0.2-6-pve root=/dev/mapper/pve-root ro quiet amd_iommu=on iommu=pt

==> /etc/kernel/cmdline <==
amd_iommu=on iommu=pt

==> /etc/default/grub <==
# If you change this file or any /etc/default/grub.d/*.cfg file,
# run 'update-grub' afterwards to update /boot/grub/grub.cfg.
# For full documentation of the options in these files, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`( . /etc/os-release && echo ${NAME} )`
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
GRUB_CMDLINE_LINUX=""

# If your computer has multiple operating systems installed, then you
# probably want to run os-prober. However, if your computer is a host
# for guest OSes installed via LVM or raw disk devices, running
# os-prober can cause damage to those guest OSes as it mounts
# filesystems to look for things.
#GRUB_DISABLE_OS_PROBER=false

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE/GOP/UGA
# you can see them in real GRUB with the command `videoinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"
 
oh also, after doing that the container where im trying to get GPU passthrough doesnt start anymore.

Code:
run_buffer: 569 Script exited with status 255
lxc_init: 1037 Failed to run lxc.hook.pre-start for container "103"
__lxc_start: 2208 Failed to initialize container "103"
TASK ERROR: startup for container '103' failed
 
oh, my bad. i thought you meant the other commands. here are the commands from #4:

Bash:
# lspci -vnnk | awk '/VGA/{print $0}' RS=
nvidia-smi
pct config 103
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti] [10de:2486] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited / Sapphire Technology Device [174b:a632]
        Flags: fast devsel, IRQ 255, IOMMU group 12
        Memory at f5000000 (32-bit, non-prefetchable) [size=16M]
        Memory at fa00000000 (64-bit, prefetchable) [size=8G]
        Memory at fc00000000 (64-bit, prefetchable) [size=32M]
        I/O ports at f000 [size=128]
        Expansion ROM at f6000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, IntMsgNum 0
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Capabilities: [bb0] Physical Resizable BAR
        Capabilities: [c1c] Physical Layer 16.0 GT/s <?>
        Capabilities: [d00] Lane Margining at the Receiver
        Capabilities: [e00] Data Link Feature <?>
        Kernel modules: nvidia
07:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raphael [1002:164e] (rev c4) (prog-if 00 [VGA controller])
        Subsystem: ASUSTeK Computer Inc. Device [1043:8877]
        Flags: bus master, fast devsel, latency 0, IRQ 81, IOMMU group 15
        Memory at fc10000000 (64-bit, prefetchable) [size=256M]
        Memory at fc20000000 (64-bit, prefetchable) [size=2M]
        I/O ports at d000 [size=256]
        Memory at f6500000 (32-bit, non-prefetchable) [size=512K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [64] Express Legacy Endpoint, IntMsgNum 0
        Capabilities: [a0] MSI: Enable- Count=1/4 Maskable- 64bit+
        Capabilities: [c0] MSI-X: Enable+ Count=4 Masked-
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [270] Secondary PCI Express
        Capabilities: [2a0] Access Control Services
        Capabilities: [2b0] Address Translation Service (ATS)
        Capabilities: [2c0] Page Request Interface (PRI)
        Capabilities: [2d0] Process Address Space ID (PASID)
        Capabilities: [410] Physical Layer 16.0 GT/s <?>
        Capabilities: [450] Lane Margining at the Receiver
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

arch: amd64
cores: 6
features: nesting=1
hostname: media
memory: 4096
mp0: /mnt/usbstick,mp=/mnt/usbstick
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.2.254,hwaddr=BC:24:11:33:11:54,ip=192.168.2.207/24,type=veth
onboot: 1
ostype: debian
rootfs: media:103/vm-103-disk-0.raw,size=3500G
swap: 2048
unprivileged: 1
lxc.cgroup2.devices.allow: c 226:0 rwm
lxc.cgroup2.devices.allow: c 226:128 rwm
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.idmap: u 0 100000 65536
lxc.idmap: g 0 100000 44
lxc.idmap: g 44 44 1
lxc.idmap: g 45 100045 62
lxc.idmap: g 107 993 1
lxc.idmap: g 108 100108 65428
 
That looks better. It doesn't use vfio-pci now but it has no driver. Please try to install the driver on the node now and reboot.
Ideally it should then say Kernel driver in use: nvidia and nvidia-smi should work.
Ultimately I'll have to look at all the data again; in this case I was mostly interested in the Kernel driver in use line, and that was missing :)
 
hey hey, i installed the driver and toolkit on the node through the link you sent. now it says the following:

Bash:
# lspci -vnnk | awk '/VGA/{print $0}' RS=
nvidia-smi
pct config 103
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti] [10de:2486] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited / Sapphire Technology Device [174b:a632]
        Flags: bus master, fast devsel, latency 0, IRQ 98, IOMMU group 12
        Memory at f5000000 (32-bit, non-prefetchable) [size=16M]
        Memory at fa00000000 (64-bit, prefetchable) [size=8G]
        Memory at fc00000000 (64-bit, prefetchable) [size=32M]
        I/O ports at f000 [size=128]
        Expansion ROM at f6000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, IntMsgNum 0
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] Secondary PCI Express
        Capabilities: [bb0] Physical Resizable BAR
        Capabilities: [c1c] Physical Layer 16.0 GT/s <?>
        Capabilities: [d00] Lane Margining at the Receiver
        Capabilities: [e00] Data Link Feature <?>
        Kernel modules: nvidiafb, nouveau, nova_core
07:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raphael [1002:164e] (rev c4) (prog-if 00 [VGA controller])
        Subsystem: ASUSTeK Computer Inc. Device [1043:8877]
        Flags: bus master, fast devsel, latency 0, IRQ 81, IOMMU group 15
        Memory at fc10000000 (64-bit, prefetchable) [size=256M]
        Memory at fc20000000 (64-bit, prefetchable) [size=2M]
        I/O ports at d000 [size=256]
        Memory at f6500000 (32-bit, non-prefetchable) [size=512K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [64] Express Legacy Endpoint, IntMsgNum 0
        Capabilities: [a0] MSI: Enable- Count=1/4 Maskable- 64bit+
        Capabilities: [c0] MSI-X: Enable+ Count=4 Masked-
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [270] Secondary PCI Express
        Capabilities: [2a0] Access Control Services
        Capabilities: [2b0] Address Translation Service (ATS)
        Capabilities: [2c0] Page Request Interface (PRI)
        Capabilities: [2d0] Process Address Space ID (PASID)
        Capabilities: [410] Physical Layer 16.0 GT/s <?>
        Capabilities: [450] Lane Margining at the Receiver
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

arch: amd64
cores: 6
features: nesting=1
hostname: media
memory: 4096
mp0: /mnt/usbstick,mp=/mnt/usbstick
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.2.254,hwaddr=BC:24:11:33:11:54,ip=192.168.2.207/24,type=veth
onboot: 1
ostype: debian
rootfs: media:103/vm-103-disk-0.raw,size=3500G
swap: 2048
unprivileged: 1
lxc.cgroup2.devices.allow: c 226:0 rwm
lxc.cgroup2.devices.allow: c 226:128 rwm
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.idmap: u 0 100000 65536
lxc.idmap: g 0 100000 44
lxc.idmap: g 44 44 1
lxc.idmap: g 45 100045 62
lxc.idmap: g 107 993 1
lxc.idmap: g 108 100108 65428
 
Weird. Let's look deeper. Please share
Bash:
modprobe nvidia
modinfo nvidia
dkms status
uname -a
dpkg -l "*nvidia*"
journalctl -b0 -g "nvidia"
Do you use Secure Boot? Did you install the debian package before? Remove it and its dependencies first.
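One thing worth reading out of the dkms status and uname output is whether DKMS built the module for the kernel that is actually running, since modprobe only looks in /lib/modules for the running release. The helper below is a sketch for that comparison: dkms_built_for is a made-up name, and it assumes the "module/version, kernel, arch: state" line format; other dkms versions may print the fields differently.

```shell
#!/bin/sh
# Read `dkms status` lines on stdin and succeed (exit 0) only if the given
# module is listed as installed for the given kernel release.
# dkms_built_for is a hypothetical helper name.
dkms_built_for() {
    module="$1"; kernel="$2"
    awk -F'[,:] *' -v m="$module" -v k="$kernel" '
        { split($1, mv, "/") }                    # "nvidia/595.84" -> "nvidia"
        mv[1] == m && $2 == k && $NF ~ /installed/ { ok = 1 }
        END { exit (ok ? 0 : 1) }
    '
}

# Usage on the node:
#   dkms status | dkms_built_for nvidia "$(uname -r)" \
#       || echo "no nvidia module built for $(uname -r)"
```

If the check fails after a kernel update, the usual suspects are missing headers for the new kernel or a DKMS build that never ran for it.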
 
no i don't use Secure Boot. I did have the package but before installing it your way i removed it with
Bash:
apt remove nvidia*
so it removes everything even related to nvidia and its drivers.

and the result to your commands:
Bash:
# modprobe nvidia
modinfo nvidia
dkms status
uname -a
dpkg -l "*nvidia*"
journalctl -b0 -g "nvidia"
modprobe: FATAL: Module nvidia not found in directory /lib/modules/7.0.12-1-pve
modinfo: ERROR: Module nvidia not found.
nvidia/595.84, 7.0.2-6-pve, x86_64: installed
Linux melloserver 7.0.12-1-pve #1 SMP PREEMPT_DYNAMIC PMX 7.0.12-1 (2026-06-09T21:07Z) x86_64 GNU/Linux
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                   Version      Architecture Description
+++-======================================-============-============-============================================================>
un  bumblebee-nvidia                       <none>       <none>       (no description available)
ii  firmware-nvidia-gsp                    550.163.01-2 amd64        NVIDIA GSP firmware
un  firmware-nvidia-gsp-550.163.01         <none>       <none>       (no description available)
rc  glx-alternative-nvidia                 1.2.2        amd64        allows the selection of NVIDIA as GLX provider
un  libegl1-glvnd-nvidia                   <none>       <none>       (no description available)
un  libgl1-glvnd-nvidia-glx                <none>       <none>       (no description available)
un  libgldispatch0-nvidia                  <none>       <none>       (no description available)
un  libgles1-glvnd-nvidia                  <none>       <none>       (no description available)
un  libgles2-glvnd-nvidia                  <none>       <none>       (no description available)
un  libglvnd0-nvidia                       <none>       <none>       (no description available)
un  libglx0-glvnd-nvidia                   <none>       <none>       (no description available)
ii  libnvidia-allocator1:amd64             550.163.01-2 amd64        NVIDIA allocator runtime library
un  libnvidia-cfg.so.1                     <none>       <none>       (no description available)
un  libnvidia-cfg1                         <none>       <none>       (no description available)
un  libnvidia-cfg1-any                     <none>       <none>       (no description available)
ii  libnvidia-container-tools              1.19.1-1     amd64        NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64             1.19.1-1     amd64        NVIDIA container runtime library
ii  libnvidia-egl-gbm1:amd64               1.1.2.1-1    amd64        GBM EGL external platform library for NVIDIA
ii  libnvidia-egl-wayland1:amd64           1:1.1.18-1   amd64        Wayland EGL External Platform library -- shared library
ii  libnvidia-eglcore:amd64                550.163.01-2 amd64        NVIDIA binary EGL core libraries
un  libnvidia-eglcore-550.163.01           <none>       <none>       (no description available)
un  libnvidia-gl-390                       <none>       <none>       (no description available)
un  libnvidia-gl-410                       <none>       <none>       (no description available)
ii  libnvidia-glcore:amd64                 550.163.01-2 amd64        NVIDIA binary OpenGL/GLX core libraries
un  libnvidia-glcore-550.163.01            <none>       <none>       (no description available)
ii  libnvidia-glvkspirv:amd64              550.163.01-2 amd64        NVIDIA binary Vulkan Spir-V compiler library
un  libnvidia-glvkspirv-550.163.01         <none>       <none>       (no description available)
ii  libnvidia-gpucomp:amd64                550.163.01-2 amd64        NVIDIA binary GPU compiler library
un  libnvidia-gpucomp-550.163.01           <none>       <none>       (no description available)
un  libnvidia-legacy-390xx-egl-wayland1    <none>       <none>       (no description available)
ii  libnvidia-ptxjitcompiler1:amd64        550.163.01-2 amd64        NVIDIA PTX JIT Compiler library
ii  libnvidia-rtcore:amd64                 550.163.01-2 amd64        NVIDIA binary Vulkan ray tracing (rtcore) library
un  libnvidia-rtcore-550.163.01            <none>       <none>       (no description available)
un  libnvidia-tesla-535-cfg1               <none>       <none>       (no description available)
un  libopengl0-glvnd-nvidia                <none>       <none>       (no description available)
rc  nvidia-alternative                     550.163.01-2 amd64        allows the selection of NVIDIA as GLX provider
un  nvidia-alternative--kmod-alias         <none>       <none>       (no description available)
un  nvidia-alternative-550.163.01          <none>       <none>       (no description available)
un  nvidia-alternative-any                 <none>       <none>       (no description available)
un  nvidia-alternative-legacy-173xx        <none>       <none>       (no description available)
un  nvidia-alternative-legacy-71xx         <none>       <none>       (no description available)
un  nvidia-alternative-legacy-96xx         <none>       <none>       (no description available)
un  nvidia-container-runtime               <none>       <none>       (no description available)
un  nvidia-container-runtime-hook          <none>       <none>       (no description available)
ii  nvidia-container-toolkit               1.19.1-1     amd64        NVIDIA Container toolkit
ii  nvidia-container-toolkit-base          1.19.1-1     amd64        NVIDIA Container Toolkit Base
un  nvidia-current                         <none>       <none>       (no description available)
un  nvidia-current-updates                 <none>       <none>       (no description available)
un  nvidia-driver                          <none>       <none>       (no description available)
un  nvidia-driver-any                      <none>       <none>       (no description available)
un  nvidia-driver-binary                   <none>       <none>       (no description available)
un  nvidia-egl-wayland-common              <none>       <none>       (no description available)
rc  nvidia-installer-cleanup               20240109+1   amd64        cleanup after driver installation with the nvidia-installer
un  nvidia-kernel-550.163.01               <none>       <none>       (no description available)
rc  nvidia-kernel-common                   20240109+1   amd64        NVIDIA binary kernel module support files
un  nvidia-kernel-dkms                     <none>       <none>       (no description available)
un  nvidia-kernel-source                   <none>       <none>       (no description available)
rc  nvidia-kernel-support                  550.163.01-2 amd64        NVIDIA binary kernel module support files
un  nvidia-kernel-support--v1              <none>       <none>       (no description available)
un  nvidia-kernel-support-any              <none>       <none>       (no description available)
un  nvidia-legacy-304xx-alternative        <none>       <none>       (no description available)
un  nvidia-legacy-304xx-driver             <none>       <none>       (no description available)
un  nvidia-legacy-304xx-kernel-support     <none>       <none>       (no description available)
un  nvidia-legacy-340xx-alternative        <none>       <none>       (no description available)
un  nvidia-legacy-340xx-kernel-support     <none>       <none>       (no description available)
un  nvidia-legacy-390xx-alternative        <none>       <none>       (no description available)
un  nvidia-legacy-390xx-kernel-support     <none>       <none>       (no description available)
rc  nvidia-legacy-check                    550.163.01-2 amd64        check for NVIDIA GPUs requiring a legacy driver
un  nvidia-modprobe                        <none>       <none>       (no description available)
un  nvidia-open-kernel-550.163.01          <none>       <none>       (no description available)
un  nvidia-open-kernel-source              <none>       <none>       (no description available)
rc  nvidia-persistenced                    550.163.01-1 amd64        daemon to maintain persistent software state in the NVIDIA d>
un  nvidia-settings                        <none>       <none>       (no description available)
rc  nvidia-support                         20240109+1   amd64        NVIDIA binary graphics driver support files
rc  nvidia-suspend-common                  550.163.01-2 amd64        NVIDIA driver - systemd power management scripts
un  nvidia-tesla-418-alternative           <none>       <none>       (no description available)
un  nvidia-tesla-418-kernel-dkms           <none>       <none>       (no description available)
un  nvidia-tesla-418-kernel-support        <none>       <none>       (no description available)
un  nvidia-tesla-440-kernel-support        <none>       <none>       (no description available)
un  nvidia-tesla-450-alternative           <none>       <none>       (no description available)
un  nvidia-tesla-450-kernel-support        <none>       <none>       (no description available)
un  nvidia-tesla-460-alternative           <none>       <none>       (no description available)
un  nvidia-tesla-460-kernel-dkms           <none>       <none>       (no description available)
un  nvidia-tesla-460-kernel-support        <none>       <none>       (no description available)
un  nvidia-tesla-470-alternative           <none>       <none>       (no description available)
un  nvidia-tesla-470-kernel-support        <none>       <none>       (no description available)
un  nvidia-tesla-510-alternative           <none>       <none>       (no description available)
un  nvidia-tesla-535-kernel-dkms           <none>       <none>       (no description available)
un  nvidia-tesla-alternative               <none>       <none>       (no description available)
un  nvidia-vdpau-driver                    <none>       <none>       (no description available)
un  nvidia-vulkan-icd                      <none>       <none>       (no description available)
ii  pve-nvidia-vgpu-helper                 0.3.1        all          Proxmox Nvidia vGPU helper script and systemd service
rc  xserver-xorg-video-nvidia              550.163.01-2 amd64        NVIDIA binary Xorg driver
un  xserver-xorg-video-nvidia-any          <none>       <none>       (no description available)
un  xserver-xorg-video-nvidia-legacy-304xx <none>       <none>       (no description available)
Jun 22 08:24:14 melloserver systemd-modules-load[506]: Failed to find module 'nvidia'
Jun 22 08:24:14 melloserver systemd-modules-load[506]: Failed to find module 'nvidia_uvm'
Jun 22 08:24:14 melloserver kernel: NovaCore 0000:01:00.0: NVIDIA (Chipset: GA104, Architecture: Ampere, Revision: a.1)
Jun 22 08:24:14 melloserver kernel: NovaCore 0000:01:00.0: Direct firmware load for nvidia/ga104/gsp/gsp-570.144.bin failed with >
Jun 22 08:24:14 melloserver kernel: input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/>
Jun 22 08:24:14 melloserver kernel: input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/>
Jun 22 08:24:14 melloserver kernel: input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/>
Jun 22 08:24:14 melloserver kernel: input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.1/0000:01:00.1/sound/card0/>
Jun 22 08:24:46 melloserver systemd[1]: Started nvidia-cdi-refresh.path - Trigger CDI refresh on NVIDIA driver or toolkit install>
Jun 22 08:24:46 melloserver systemd[1]: Starting nvidia-cdi-refresh.service - Refresh NVIDIA CDI specification file...
Jun 22 08:24:46 melloserver systemd[1]: Starting nvidia-persistenced.service - NVIDIA Persistence Daemon...
Jun 22 08:24:46 melloserver systemd[1]: nvidia-cdi-refresh.service: Skipped due to 'exec-condition'.
Jun 22 08:24:46 melloserver systemd[1]: Condition check resulted in nvidia-cdi-refresh.service - Refresh NVIDIA CDI specification>
Jun 22 08:24:46 melloserver nvidia-persistenced[1087]: Failed to query NVIDIA devices. Please ensure that the NVIDIA device files>
Jun 22 08:24:46 melloserver nvidia-persistenced[1074]: nvidia-persistenced failed to initialize. Check syslog for more details.
Jun 22 08:24:46 melloserver systemd[1]: nvidia-persistenced.service: Control process exited, code=exited, status=1/FAILURE
Jun 22 08:24:46 melloserver systemd[1]: nvidia-persistenced.service: Failed with result 'exit-code'.
Jun 22 08:24:46 melloserver systemd[1]: Failed to start nvidia-persistenced.service - NVIDIA Persistence Daemon.