Major number for nvidia-uvm changes over reboot

hlab

Mar 26, 2023
Hi,
I have LXCs working with `nvidia-uvm` passed through, but when I reboot the host, the major number changes.
Is there a way to keep the major number the same across reboots,
OR
a way to figure it out and update the container config on reboot?
 
Run `ls -l /dev/nvidia-uvm` on the host. It shows the major and minor device numbers right after the group column.
In the LXC config I have, e.g.:
```
lxc.cgroup2.devices.allow = c 236:1 rwm
lxc.cgroup2.devices.allow = c 236:0 rwm
...
lxc.mount.entry = /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file 0 2
lxc.mount.entry = /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file 0 2
...
```
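If you'd rather not read the numbers off `ls -l` by eye, something like the following should pull them out with `stat`. This is only a sketch: `dev_numbers` and `allow_line` are names made up for illustration, and it assumes a `stat` that supports `-c` with the `%t`/`%T` format codes (GNU coreutils, as on a PVE host).

```shell
# Print "MAJOR:MINOR" in decimal for a device node.
# stat reports %t (major) and %T (minor) in hexadecimal, so convert them.
dev_numbers() {
    hex_major=$(stat -c '%t' "$1") || return 1
    hex_minor=$(stat -c '%T' "$1") || return 1
    printf '%d:%d\n' "0x$hex_major" "0x$hex_minor"
}

# Print the matching cgroup2 allow line for a character device node.
# (Helper name is hypothetical; the line format is the one used in the config above.)
allow_line() {
    printf 'lxc.cgroup2.devices.allow = c %s rwm\n' "$(dev_numbers "$1")"
}
```

Calling `allow_line /dev/nvidia-uvm` on the host would then print the line to put in the container config for whatever major number is current.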
 
OK, it finally happened again.
Yes, I know how to find the major and minor device numbers; however, the major number changes across PVE reboots.
For example:
BEFORE:
Bash:
root@pve:~# ls -l /dev/nvidia-uvm
crw-rw-rw- 1 root root 508, 0 Dec  4 15:03 /dev/nvidia-uvm
AFTER:
Bash:
root@pve:~# ls -l /dev/nvidia-uvm
crw-rw-rw- 1 root root 505, 0 Dec  4 15:03 /dev/nvidia-uvm

Is there a way to keep the major device number the same, or to figure it out at boot and update the LXC config?
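A rough sketch of the "figure it out at boot and update the config" route, in case it helps. As far as I know the kernel lists the major currently assigned to each driver in `/proc/devices`, so the new value can be read from there and substituted into the allow lines. The function names `current_major` and `replace_major` are made up, the old major has to be supplied by you (for example from the line currently in the config), and GNU `sed` with `-i -E` is assumed.

```shell
# Print the major number listed for a driver name in a /proc/devices-style
# file. $1 = driver name, $2 = file to read (defaults to /proc/devices).
current_major() {
    awk -v name="$1" '$2 == name { print $1; exit }' "${2:-/proc/devices}"
}

# Rewrite the major on every "lxc.cgroup2.devices.allow = c OLD:N ..." line.
# $1 = config file, $2 = old major, $3 = new major.
# Lines with a different major (or other keys) are left untouched.
# Accepts both "key = value" and "key: value" spellings.
replace_major() {
    # Refuse to run with an empty old or new major.
    [ -n "$2" ] && [ -n "$3" ] || return 1
    sed -i -E \
        "s/^(lxc\.cgroup2\.devices\.allow[[:space:]]*[=:][[:space:]]*c[[:space:]]+)$2:/\1$3:/" \
        "$1"
}
```

Usage would be something like `replace_major /path/to/container.conf 508 "$(current_major nvidia-uvm)"`, run after the NVIDIA modules are loaded and before the container starts. The config path is a placeholder, and it is worth keeping a backup copy of the config before letting a script edit it.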
 
I also have this problem on 7.4, using cgroup2 to make my GPU available to some unprivileged LXCs. Did you ever make any progress?

It bites me just about every reboot. I'm wishing cgroup2 had an alternative to using device numbers, along the lines of identifying disks by their UUID. I don't see anything in the docs, but hope springs eternal.
 
Nope, I've just made a habit of checking all the numbers after every reboot or power cycle.
 
For anyone coming across this later, this problem is solved in v8.1:

Proxmox 8.1 (and maybe 8.0?) has explicit device sharing by filename. My example, sharing my GPU, now looks like this in the lxc.conf. The cgroup2.devices.allow and mount.entry entries are no longer required.

Code:
dev0: /dev/nvidia0
dev1: /dev/nvidiactl
dev2: /dev/nvidia-uvm
dev3: /dev/nvidia-uvm-tools
 