PVE 9 - NVIDIA Container Toolkit Broken

It is in the no-subscription repo. I've set it up the same way as before, like in that tutorial I linked.

@dasunsrule32 Just to double check, with your approach you don't even need to install the driver (with --no-kernel-modules) in the lxc? That would be great. I wanted to do that somehow before, but didn't know how, so I settled on installing it on both host and lxc.
Nope, no need to install the driver in the container. The host's driver files get bind mounted into the container for use. You can see them in the mount output I posted there.

See my post further up: https://forum.proxmox.com/threads/pve-9-nvidia-container-toolkit-broken.169364/post-797860
 
Fantastic, I'll give that a try in a fresh container.

Btw, do you have a suggestion on how to clean up what was already installed in the existing ones (with --no-kernel-modules)? I don't want to risk moving everything, but I don't want to keep doing this double install every time I update drivers either.
 
Make a backup and do not use the nvidia hook until you uninstall the manually installed driver.

Run the installer to uninstall the drivers; then you "should" be able to use the bind mount options in the config. Delete the card permissions from the config as well: you don't need to handle them yourself, as the nvidia hook handles all that for you automatically.

However, my recommendation would be to spin up a new container in case something gets left over.
 
Also, the hook will only work with unprivileged containers. So you'd need to continue the driver route if you are using privileged containers.
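
If you're not sure which kind a container is, here is a minimal sketch for checking. It assumes you feed it the text of the CT config (normally /etc/pve/lxc/<vmid>.conf on the host) and that unprivileged containers carry an `unprivileged: 1` line there. The function name and the sample hostname are made up for illustration.

```python
def is_unprivileged(config_text: str) -> bool:
    """Return True if the main section of a CT config sets `unprivileged: 1`."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Snapshot sections come after the main section; stop there.
            break
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() == "unprivileged":
            return value.strip() == "1"
    # Assumption: a missing key means the container is privileged.
    return False


# Hypothetical config excerpt, not taken from a real container.
sample = """arch: amd64
hostname: gpu-ct
unprivileged: 1
"""
print(is_unprivileged(sample))  # True
```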
 
Unfortunately I'm using privileged. I guess spinning up a new one is the best way to go. The problem is that I'm running a bunch of docker containers in some of them, so I need to be super careful to port everything without losing something valuable. I'll give it a try.
 
You don't need privileged for Docker. My config above will actually work with Docker as well. The only thing you really need to do is set the ownership to the correctly mapped IDs, for example 100000:100000 for root, if you're bind mounting data from the local host.

Example of some data I have:
Code:
drwxr-xr-x - 103568 103568  8 Sep 11:09  apps
drwxr-xr-x - root   adm     2 Sep  2024  logos
drwxrwxr-x - root   adm     8 Sep 13:56  scripts
drwxr-xr-x - 100000 100000 22 Sep 16:07  stacks
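
To work out which host ID to chown to, here is a rough sketch of the arithmetic. It assumes the default unprivileged mapping (container IDs 0-65535 mapped to host IDs starting at 100000); if you use a custom idmap, the numbers will differ. The function names are just for illustration.

```python
ID_OFFSET = 100000  # first host ID of the mapped range (assumed default)
ID_RANGE = 65536    # number of IDs in the mapped range (assumed default)


def to_host_id(container_id: int) -> int:
    """Host UID/GID that a container UID/GID shows up as on the host."""
    if not 0 <= container_id < ID_RANGE:
        raise ValueError(f"{container_id} is outside the mapped range")
    return ID_OFFSET + container_id


def to_container_id(host_id: int) -> int:
    """Container UID/GID that a host-side owner corresponds to."""
    if not ID_OFFSET <= host_id < ID_OFFSET + ID_RANGE:
        raise ValueError(f"{host_id} is outside the mapped range")
    return host_id - ID_OFFSET


print(to_host_id(0))            # 100000, container root as seen on the host
print(to_container_id(103568))  # 3568, the owner of "apps" inside the container
```

So the "stacks" directory above belongs to root inside the container, and "apps" belongs to UID/GID 3568 inside it.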
 
Yeah, I know. I started with privileged since it was easier to do the passthrough, or at least I thought so. Now I'm stuck with those, with a bunch of stuff in them. I'll try porting everything to an unprivileged one.
Even with the driver, privileged still isn't needed. You just have to set the card permissions for the /dev/nv* devices in the container config. This is how we all learn though!

You got this!
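
For the manual driver route mentioned above, here is a hedged sketch of how the card permission lines could be generated instead of typed by hand. It assumes a cgroup v2 host (hence `lxc.cgroup2.devices.allow`) and the commonly used `lxc.mount.entry` bind-mount form. The helper names are made up, and the output is a suggestion to review before pasting into the CT config, not a definitive setup. Device major numbers can differ between hosts, which is why the sketch reads them from the device nodes rather than hardcoding them.

```python
import glob
import os
import stat


def allow_line(major: int) -> str:
    """cgroup v2 rule allowing all character devices with this major number."""
    return f"lxc.cgroup2.devices.allow: c {major}:* rwm"


def mount_line(path: str) -> str:
    """Bind mount a host device node to the same path inside the container."""
    target = path.lstrip("/")  # mount targets are relative to the container rootfs
    return f"lxc.mount.entry: {path} {target} none bind,optional,create=file"


def config_lines(paths: list[str]) -> list[str]:
    """Build allow and mount lines for the character devices among `paths`."""
    majors: list[int] = []
    mounts: list[str] = []
    for path in paths:
        st = os.stat(path)
        if not stat.S_ISCHR(st.st_mode):
            continue  # skip directories such as /dev/nvidia-caps
        major = os.major(st.st_rdev)
        if major not in majors:
            majors.append(major)
        mounts.append(mount_line(path))
    return [allow_line(m) for m in majors] + mounts


# Run on the host; prints nothing if no NVIDIA device nodes exist.
for line in config_lines(sorted(glob.glob("/dev/nvidia*"))):
    print(line)
```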
 
@dasunsrule32 I don't appreciate the full quote but I briefly tested it and quite like it. Not sure why I didn't test it sooner. I'll switch over to that and document it if it works nicely. I'm just not much of a fan of manually editing the CT config. This would also solve the driver/library discrepancy between the node and CTs. Thanks for convincing me to try it!
Edit: It's now documented as well. Let me know if you have suggestions.
 