PVE 9 - NVIDIA Container Toolkit Broken

rausche

New Member
Aug 7, 2025
Following the upgrade from PVE 8 to PVE 9, I can no longer start my LXCs that use the NVIDIA Container Toolkit.

I have a variety of containers that attach to the same GPU for things like transcoding, rendering, Stable Diffusion, LLMs, etc -- so passthrough/dedicated assignment is a nuisance, and the container toolkit provides an elegant method to share the GPU between containers seamlessly and with low administrative overhead.

Some details:
  • was previously on the optional 6.14 kernel on PVE 8, no issues there
  • using NVIDIA Driver 570.181, upgraded from 570.172.08 during troubleshooting
  • using NVIDIA Container Toolkit 1.18.0~rc.2, upgraded from 1.17.8 during troubleshooting
I use the following lines in the LXC config to hook in the toolkit:
lxc.hook.pre-start: sh -c '[ ! -f /dev/nvidia-uvm ] && /usr/bin/nvidia-modprobe -c0 -u'
lxc.environment: NVIDIA_VISIBLE_DEVICES=all
lxc.environment: NVIDIA_DRIVER_CAPABILITIES=compute,utility,video
lxc.hook.mount: /usr/share/lxc/hooks/nvidia
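
(Side note: verifying the host-side stack independently of LXC is just a matter of exercising the driver and toolkit CLIs directly; if either of these fails, the problem is below the container layer.)

Code:
# host-side sanity check, independent of LXC
nvidia-smi                # is the kernel driver loaded and responding?
nvidia-container-cli list # can the toolkit enumerate devices and libraries?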

In my LXC debug logs during container startup I see entries like the following:
  • DEBUG utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/nvidia 117 lxc mount produced output: + exec nvidia-container-cli --user configure --no-cgroups --ldconfig=@/usr/sbin/ldconfig --device=all --compute --utility --video /usr/lib/x86_64-linux-gnu/lxc/rootfs
  • DEBUG utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/nvidia 117 lxc mount produced output: nvidia-container-cli: mount error: open failed: /usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/1/ns/mnt: permission denied
I notice that the hook script is passing '--no-cgroups', presumably because this is an unprivileged container, and this seems to be the cause of my problems. However, since the NVIDIA Container Toolkit has apparently supported cgroups for a long time, I tried modifying the hook script to remove that argument (sketched after the log excerpt below) -- which causes the following debug lines to appear instead:
  • DEBUG utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/nvidia 117 lxc mount produced output: + exec nvidia-container-cli --user configure --ldconfig=@/usr/sbin/ldconfig --device=all --compute --utility --video /usr/lib/x86_64-linux-gnu/lxc/rootfs
  • DEBUG utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/nvidia 117 lxc mount produced output: nvidia-container-cli: container error: failed to get device cgroup mount path: relative path in mount prefix: /../../..
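For anyone wanting to reproduce that edit: it amounts to stripping the flag out of the packaged hook script. I edited the file by hand; the one-liner below is an untested equivalent, and note that the file is replaced on lxc package updates, so the change has to be re-applied:

Code:
# back up the shipped hook, then strip the --no-cgroups flag from it
cp /usr/share/lxc/hooks/nvidia /usr/share/lxc/hooks/nvidia.bak
sed -i 's/ --no-cgroups//' /usr/share/lxc/hooks/nvidia
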
During troubleshooting I also tried running the container as a privileged container; however, the debug log reports that the '/usr/share/lxc/hooks/nvidia' hook can only be run in the unprivileged context.

I am at something of a loss here, and hoping not to have to revert to PVE 8. Is there a convenient way to embed appropriate cgroup2 support into the LXC config to make it usable with the NVIDIA Container Toolkit?
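
(For context, the per-container wiring I'd like to avoid maintaining by hand looks roughly like the lines below. The cgroup2 device-allow and mount-entry keys are standard LXC config, but the device major numbers are only examples and vary between hosts -- check ls -l /dev/nvidia* -- and this approach bypasses the toolkit entirely.)

Code:
# illustrative only: classic manual GPU exposure for an LXC container,
# added to /etc/pve/lxc/<vmid>.conf; device major numbers vary per host
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 509:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file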
 
Same issue:

Code:
lxc-start <contnum> 20250809114632.390 DEBUG    cgfsng - ../src/lxc/cgroups/cgfsng.c:__cgroupfs_mount:2197 - Mounted cgroup filesystem cgroup2 onto 19((null))
lxc-start <contnum> 20250809114632.390 INFO     utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxcfs/lxc.mount.hook" for container "<contnum>", config section "lxc"
lxc-start <contnum> 20250809114632.411 INFO     utils - ../src/lxc/utils.c:run_script_argv:587 - Executing script "/usr/share/lxc/hooks/nvidia" for container "<contnum>", config section "lxc"
lxc-start <contnum> 20250809114632.424 DEBUG    utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/nvidia <contnum> lxc mount produced output: mkdir: cannot create directory ‘/var/lib/lxc/<contnum>/hook’
lxc-start <contnum> 20250809114632.424 DEBUG    utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/nvidia <contnum> lxc mount produced output: : Permission denied
lxc-start <contnum> 20250809114632.428 DEBUG    utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/nvidia <contnum> lxc mount produced output: + exec nvidia-container-cli --user configure --no-cgroups --ldconfig=@/usr/sbin/ldconfig --device=all --compute --compat32 --display --graphics --utility --video /usr/lib/x86_64-linux-gnu/lxc/rootfs
lxc-start <contnum> 20250809114632.448 DEBUG    utils - ../src/lxc/utils.c:run_buffer:560 - Script exec /usr/share/lxc/hooks/nvidia <contnum> lxc mount produced output: nvidia-container-cli: mount error: open failed: /usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/1/ns/mnt: permission denied

Might be more of an NVIDIA Container Toolkit issue (compatibility with the PVE 9 LXC version, perhaps?) -- but it's useful for other Proxmox users to know they should not update to PVE 9 for now if they rely on nvidia-container-toolkit.
 
Might be more of an NVIDIA Container Toolkit issue (compatibility with the PVE 9 LXC version, perhaps?)
PVE 9 moves from LXC 6.0.0 to LXC 6.0.4, which was released in April 2025 -- so I think NVIDIA has had time to build in support for this LXC version and/or for compatibility reports to be filed... but I don't find any similar issues when researching these debug log entries, other than this issue, which is somewhat similar. However, the workaround for that issue is no longer valid for PVE 9.
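
(For anyone wanting to confirm which LXC userspace their node actually ships, something like this should do it:)

Code:
# check the LXC version installed on the PVE node
lxc-start --version
pveversion -v | grep -i lxc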

I think PVE 9's implementation of LXC, with the full removal of cgroup v1 mounts, is now 'botched' for the container toolkit because:
  1. the container toolkit hook script is somehow deciding to launch with the --no-cgroups flag; I'm not sure what the logic for this is in the script
  2. with the --no-cgroups flag, the hook script (and the container itself?) is not allowed to interact with the /proc filesystem read-write before pivot_root occurs
  3. without the --no-cgroups flag, the hook script cannot determine the path to the temporary /proc filesystem, since the container passes a relative path instead of an absolute path
Additionally, using native LXC arguments like lxc.namespace.clone=proc sys mnt or lxc.mount.auto=proc:rw sys:rw cgroup-full:rw (shown below as they would appear in the container config) either doesn't work or appears to break other PVE-related components required to launch the container successfully.
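
For reference, these are the kinds of lines I experimented with, as they would appear in /etc/pve/lxc/<vmid>.conf (neither helped, and some broke PVE's own startup handling):

Code:
# experiments that did not pan out
lxc.namespace.clone: proc sys mnt
lxc.mount.auto: proc:rw sys:rw cgroup-full:rw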

But, all said, I am fairly new to debugging LXC and I'm somewhat grasping at straws here. I feel as though the answer is probably somewhere in the LXC documentation, but I'm not actually sure this is resolvable without modifying PVE's own container-launching logic and parameters.
 
The issue might be at least partly related to AppArmor. It looks like the migration to Debian Trixie implied a migration to AppArmor 4, which in turn brought new defaults and configuration-file changes that Proxmox is having some trouble fully catching up with (see, for instance, the bug "apparmor problem mqueue in LXC" and the forum thread "Proxmox VE 9.0 BETA LCX Docker not working").

Here are the system logs for AppArmor when trying to start the container with the default AppArmor configuration and with the NVIDIA Container Toolkit hooks. I must emphasize that the container is unprivileged.

Code:
[X.677367] audit: type=1400 audit(X.806:319): apparmor="DENIED" operation="getattr" class="posix_mqueue" profile="/usr/bin/lxc-start" name="/" pid=1656100 comm="vgs" requested="getattr" denied="getattr" class="posix_mqueue" fsuid=0 ouid=0
[X.719834] audit: type=1400 audit(X.849:320): apparmor="DENIED" operation="getattr" class="posix_mqueue" profile="/usr/bin/lxc-start" name="/" pid=1656101 comm="lvs" requested="getattr" denied="getattr" class="posix_mqueue" fsuid=0 ouid=0
[X.932507] audit: type=1400 audit(X.061:321): apparmor="STATUS" operation="profile_load" profile="/usr/bin/lxc-start" name="lxc-108_</var/lib/lxc>" pid=1656118 comm="apparmor_parser"
[X.530805] audit: type=1400 audit(X.660:322): apparmor="STATUS" operation="profile_remove" profile="/usr/bin/lxc-start" name="lxc-108_</var/lib/lxc>" pid=1656282 comm="apparmor_parser"

I tried to implement the mqueue workaround advised in the bug "apparmor problem mqueue in LXC", but this did not change anything. It is likely useless anyway, since the patch has already been implemented and distributed, and it may apply to privileged containers only.

I swear I could get the container to start correctly (though with the NVIDIA hooks not necessarily working -- I did not check) when I changed the container config to COMPLETELY DISABLE AppArmor, using lxc.apparmor.profile = unconfined (as described on the Proxmox wiki "Linux_Container" page). However, this is not reliably reproducible, so on top of the AppArmor issue there might be race conditions or conflicts with whatever the NVIDIA hook script is actually doing.
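
For reference, that test amounted to a single line in /etc/pve/lxc/<vmid>.conf; it switches off AppArmor confinement for the container entirely, so it is only suitable for testing:

Code:
# disable AppArmor for this container (testing only)
lxc.apparmor.profile: unconfined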
 
On that beta thread, this stuck out to me. I wonder if updating to a Trixie LXC image would do the trick?
FYI, I can see that in a Debian bookworm based container, but it seems alright in a Debian Trixie based one, at least an ls /dev/mqueue works there.
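
(Running that check from the PVE host is quick, e.g.:)

Code:
# check from the host whether /dev/mqueue is usable inside a container
pct exec <vmid> -- ls -la /dev/mqueue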
 
I was working with the Incus project owner on their forums, and it looks like this is a regression in LXC 6.0.4 that was fixed in 6.0.5.

I'm going to open a bug report with Proxmox.

Thread: https://discuss.linuxcontainers.org/t/lxc-nvidia-container-toolkit/24563/4?u=dasunsrule32
6.0.5 release thread: https://discuss.linuxcontainers.org/t/lxc-6-0-5-lts-has-been-released/24438
This does sound promising.

Keep us posted. I'll be happy to help test any resolution or workaround offered.