[TUTORIAL] GPU accelerated Graphical Remote Desktop with support for nested containers and flatpaks in unprivileged LXC containers

Ozymandias42

Oct 24, 2025
This guide explains how to get GPU-accelerated applications and full desktop remoting working in unprivileged containers.
It further explains how to get nested containers to run. This includes not only things like Docker, Podman or Kubernetes, but also Flatpaks, as these too are containers by their use of namespaces and cgroups.


GPU acceleration

The first and easiest hurdle to solve is GPU acceleration.
For this the UI can be used. Since PVE 9 one simply has to add the /dev/dri/cardX and /dev/dri/renderDXXX nodes corresponding to the GPU to be used.
On single-GPU systems this is usually card0 and renderD128; on multi-GPU systems (or on special hardware like the Raspi) it can also be card1 or renderD129.
The Raspi for example splits the GPU into card0 and card1 in addition to renderD128.

The dialogue now has the option to supply the container with a user and group id (uid and gid).
This option exists so the (unprivileged) container knows how to map the host's uid/gid for the given device node onto a corresponding id inside the container.
This is vital for Direct Rendering Infrastructure (DRI) devices, as these can only be accessed by root, or by applications or users in the video or render groups.

This also means that the uid for the card0 and renderD128 device nodes can be ignored here, as it is usually just 0 for root.
The gid however is vital.
So for this to work one needs to know the gid of video and render inside the container.
This is easy to find however: one just needs to look into /etc/group.
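A minimal sketch for looking the gids up from inside the container; the exact values differ per distribution (on Debian-based containers video is commonly 44 and render 104, but always check):

```shell
# Look up the gid of a group inside the container.
# getent reads /etc/group; the third colon-separated field is the gid.
gid_of() {
    getent group "$1" | cut -d: -f3
}

gid_of video    # commonly 44 on Debian-based containers
gid_of render   # commonly 104 on Debian-based containers
```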

A special tip for AMD cards, and for containers on a Raspi, as this applies to using LXC there too:
For AMD cards one can also add /dev/kfd, as this is required for ROCm.
For Raspis, if the v4l2video module is loaded and /dev/video{10..30} therefore exists, these can also be passed through.
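For reference, device passthrough done via the UI ends up as devN entries in /etc/pve/lxc/<container-id>.conf. A sketch of what such a config could look like; the gid values 44 and 104 are assumptions taken from Debian's defaults for video and render, so check /etc/group in your container:

```
dev0: /dev/dri/card0,gid=44
dev1: /dev/dri/renderD128,gid=104
dev2: /dev/kfd,gid=104
```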

The /dev/video{10..30} device nodes (one suffices) can, for example, be used for video transcoding in Jellyfin.
(This is required there as modern Jellyfin versions no longer implement the Raspi-specific MMAL en-/decoder and use the V4L2 implementation on the platform-agnostic /dev/videoX interfaces.
On other platforms this is done via the cardX interfaces via VA-API, Intel QuickSync or VDPAU.)

GPU accelerated graphical remoting

This can be accomplished very easily via waypipe: simply install it on both the container and the device meant to connect from and you're golden.
Invocation looks something like this:
Code:
waypipe -c none ssh <user>@<containerFQDN-or-ip>

This basically produces an SSH session and acts the same as ssh -X or ssh -Y did in Xorg/X11 times, only that with Wayland this is naturally GPU accelerated, without hacky GLX indirection tricks like VirtualGL.

So to start a desktop session from there one simply needs to run something like:
Code:
dbus-run-session -- startplasma-wayland
That's it.

NOTE: This will have neither shared clipboard nor audio.
These can be set up manually though, via the wl-clipboard package on both ends, as well as networked audio sinks via PipeWire.
Clipboard sharing can then be done via something like:
Code:
wl-paste | ssh <user>@<lxc> wl-copy
Or, potentially in the future, via remote sessions through the new sddm-based native Plasma display manager, or GDM for GNOME users.
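The one-shot command above copies the clipboard once. A continuous one-way sync can be sketched with wl-paste's --watch mode, which re-runs a command with the new clipboard contents on stdin every time the clipboard changes (<user> and <lxc> are placeholders; the reverse direction needs the mirrored setup started from inside the container):

```
wl-paste --watch ssh <user>@<lxc> wl-copy
```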

An alternative to this variant is to run the Wayland compositor of choice on top of Xorg in conjunction with xrdp. There are many ways.


Nested cgroups, podman and docker

Now onto the exciting and more difficult part.
One would think that simply providing the nesting feature flag is enough for this to work, as according to the docs it "[...] expose(s) procfs and sysfs contents of the host to the guest".

A look into /var/lib/lxc/<containerid>/config however will show the following relevant line:
Code:
lxc.mount.auto = sys:mixed

And when trying to actually run a container like a Flatpak, the error will say something about not being able to write to /proc/sys/user/max_cgroup_namespaces.
The solution therefore is to add the following line to /etc/pve/lxc/<container-id>.conf:
Code:
lxc.mount.auto: sys:mixed proc:rw
This will modify the line accordingly when the config file under /var/lib/lxc/<containerID> is generated at container start.

Now nested cgroups will work.
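Whether that actually worked can be sketch-verified from inside the container (assuming util-linux's unshare is installed; the sysctl path is the one from the error above):

```shell
# The limit that could not be written before; 0 would mean nested
# cgroup namespaces are disabled entirely.
max_ns=$(cat /proc/sys/user/max_cgroup_namespaces)
echo "max_cgroup_namespaces: $max_ns"

# Try to actually enter a new cgroup namespace (may still be denied
# in restricted environments).
if unshare --cgroup --fork true 2>/dev/null; then
    echo "nested cgroup namespaces: ok"
else
    echo "nested cgroup namespaces: denied"
fi
```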


Docker, Podman in rootless mode and kubernetes

If one wants to run Docker then, in accordance with the docs, the keyctl feature flag needs to be supplied as well, as Docker is special insofar as it needs to be able to use this syscall.
What this feature flag does under the hood is add the following line to /var/lib/lxc/<container-id>/rules.seccomp
Code:
keyctl errno 38
(38 is ENOSYS, so the syscall reports "not implemented" instead of failing with a permission error.)

Podman does not need this feature flag. However, running Podman in rootless mode will throw the following error unless additional changes are made:

Code:
ERRO[0000] running `/usr/bin/newuidmap 1068 0 1000 1 1 100000 65536`: newuidmap: write to uid_map failed: Operation not permitted
Error: cannot set up namespace using "/usr/bin/newuidmap": exit status 1

The reason for this error is that the default subuid and subgid maps configured under /etc/sub{g,u}id cannot be applied, because the container only has 65536 ids available as configured host-side.

To check the available range inside a container (or elsewhere) one can run:
Code:
cat /proc/self/uid_map
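The three columns of uid_map are: start id inside the namespace, start id in the parent namespace, and the length of the range. A small sketch to print just the available range size (awk assumed present):

```shell
# Print the size of the first mapped uid range for the current process.
uid_range() {
    awk 'NR == 1 { print $3 }' /proc/self/uid_map
}

uid_range
```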

Now there are two ways of fixing this. The first is to do it from the host side as outlined here:
https://forum.proxmox.com/threads/podman-in-rootless-mode-on-lxc-container.141790/post-661678

In this variant the id range assigned on the host side is extended by the default offset inside the guest, i.e. by another 100000.
Meaning, with the host as reference, the uid space of the LXC guest is offset by 100000 so that root there equals 100000 on the host; for unprivileged Podman containers inside the LXC container it is offset by 100000 again, meaning root in the Podman container is 100000 in the LXC and 200000 on the host.
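The double offset can be sketched as simple arithmetic; the 100000 offsets are the defaults from /etc/subuid, real setups may differ:

```shell
LXC_OFFSET=100000      # host uid where the LXC guest's uid 0 starts
PODMAN_OFFSET=100000   # guest uid where the Podman container's uid 0 starts

# Map a uid inside the Podman container to the corresponding host uid.
podman_to_host() {
    echo $((LXC_OFFSET + PODMAN_OFFSET + $1))
}

podman_to_host 0      # root in the Podman container -> 200000 on the host
```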

To do this, the content of /etc/sub{u,g}id is changed from root:100000:65536 to root:100000:165536.
This change then also needs to be propagated to /var/lib/lxc/<container-id>/config, where the corresponding default lines are:
Code:
lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536

So analogous to how nested cgroups were made to work, these lines are modified accordingly and added to /etc/pve/lxc/<container-id>.conf
Code:
lxc.idmap: u 0 100000 165536
lxc.idmap: g 0 100000 165536

The other way is to change /etc/sub{g,u}id on the container side so as not to exceed the allotted 65536 ids, by changing both the offset and the range to something within them. For example: <user>:20000:10000. In this case the offset is 20000 and the range of supplied ids is 10000, which is still firmly within the available 65536 ids.

Now the above error message will no longer appear. However, a new one will appear, complaining about not being able to assign a tun network interface.
For this a host-side change to /etc/pve/lxc/<container-id>.conf is required. It's the following:
Code:
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net dev/net none bind,create=dir
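Inside the container, the passthrough can then be checked by looking at the device node: it should be a character device with major 10 and minor 200, matching the devices.allow line above (the owner may show up as nobody/nogroup due to the id mapping):

```
$ ls -l /dev/net/tun
crw-rw-rw- 1 nobody nogroup 10, 200 ... /dev/net/tun
```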

Finally, Kubernetes:
Kubernetes, or at least the k3s distribution of it, has a special caveat of writing log output to /dev/kmsg, which does not exist in a container.
This however is trivial to fix by creating a symlink at that path pointing to /dev/console, via ln -s /dev/console /dev/kmsg
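Since /dev in the container is a tmpfs, such a symlink is gone after a restart. One way to recreate it automatically on systemd-based containers is a tmpfiles.d entry (the file name kmsg.conf is arbitrary; the L type creates a symlink at boot):

```
# /etc/tmpfiles.d/kmsg.conf
L /dev/kmsg - - - - /dev/console
```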

Proxying Mountpoints from the Host

Mounts can easily be proxied from the host by mounting with -o uid=101000 on the host, to have the uid match the uid-mapped first unprivileged user (uid 1000 on Debian) inside the container. This works since uid mapping simply shifts the uids by +100000 for a range of 65536 uids, as apparent from this line in /etc/subuid: root:100000:65536

To get the mountpoint into the container one just needs to add the following line to /etc/pve/lxc/<container-id>.conf:
Code:
mp0: <mountpoint-host>,mp=<mountpoint-container>
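The uid arithmetic can be sketched as follows. The mount example is hypothetical (share and paths are placeholders) and assumes a filesystem that honours the uid= option, such as cifs or vfat:

```shell
# Map a uid inside the container to the corresponding host uid,
# given the default root:100000:65536 mapping from /etc/subuid.
container_to_host_uid() {
    echo $((100000 + $1))
}

container_to_host_uid 1000   # first unprivileged user -> 101000

# Hypothetical example: mount host-side so files appear owned by
# uid 1000 inside the container.
# mount -t cifs //nas/share /mnt/share -o uid=$(container_to_host_uid 1000)
```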

Use cases
  • Using an AMD-based Mini-PC's full GPU power for tasks like video editing, panorama stitching, etc.
  • Running LocalAI with the Vulkan backend (as it does not work with VirGL in VMs (without Google's Venus patches))
  • Playing games
  • Running a Kubernetes node or Docker in an unprivileged container. (Caveat: no native mounts of network shares, even though that is probably theoretically doable too, with enough changes to replicate the privileged containers' nfs and smb mount feature flags.)
 