Docker in an unprivileged LXC container on Proxmox 7 with ZFS

Probably yes, but this is not supported, and you will therefore not get much help for it here.

Ah, I didn't realize this - I see it is in the documentation, thanks for pointing it out.


If you want to run application containers, for example, Docker images, it is recommended that you run them inside a Proxmox Qemu VM. This will give you all the advantages of application containerization, while also providing the benefits that VMs offer, such as strong isolation from the host and the ability to live-migrate, which otherwise isn’t possible with containers.
 
Also very interested in this. We are running Docker servers in VMs as well as in LXCs on LVM with PVE 6. We prefer LXC mainly for the bind mount options, so it is unclear to us whether the new issues are due to PVE 7 or to ZFS; it would be interesting to know before we upgrade.

I understand the PVE team's long-standing reluctance to support Docker in general and Docker on LXC in particular, but I also see many advantages to it in terms of filesystem mounts and performance versus running it in a pure VM.
 
Migrated today from Proxmox 6.4 to 7.1-8. I wanted to migrate my Proxmox root (hardware RAID 1) to a ZFS software RAID 1 mirror, so I installed from the ISO from scratch.

I have 3 unprivileged LXCs with Docker nested inside and about 25 Docker containers.

In each of these LXCs, the Docker system directory /var/lib/docker points to a ZFS vol, formatted as XFS and added as a mountpoint in the LXC (I basically followed this, also in this ansible notebook).
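Roughly, that setup looks like this on the Proxmox host (a minimal sketch only - the pool/dataset names, the 32G size and the container ID 101 are placeholders, and you would also add the mount to /etc/fstab so it survives reboots):
Code:
# create a zvol, format it as XFS and hand it to the container as /var/lib/docker
zfs create -V 32G rpool/data/docker-ct101
mkfs.xfs /dev/zvol/rpool/data/docker-ct101
mkdir -p /mnt/docker-ct101
mount /dev/zvol/rpool/data/docker-ct101 /mnt/docker-ct101
# for an unprivileged LXC, root inside the container maps to UID 100000 on the host by default
chown 100000:100000 /mnt/docker-ct101
# add it as a mountpoint to CT 101, excluded from vzdump backups
pct set 101 -mp0 /mnt/docker-ct101,mp=/var/lib/docker,backup=0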

Migration worked flawlessly.

Check storage driver in LXC:
Code:
docker info | grep -A 7 "Storage Driver:"
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2

I did not need to restore from backups; I could directly start the LXCs from the ZFS-backed storage, as described here.

In summary, this was a really smooth ride from 6.4 to 7.1.

I decided on the ZFS vol approach because there is very little documentation on fuse-overlayfs and how reliable it is.
With XFS backed by ZFS, the results seemed more predictable, and I need reliability for my Docker containers (note that this is entirely subjective and I haven't tested fuse-overlayfs at all).

The two approaches are interchangeable, so I might test fuse-overlayfs on /var/lib/docker later.

Also note the Native Overlay Diff: false - I am not sure whether this is a default or bound to the specific approach I used. In either case, it means that Docker builds may be a little slower. If you use this approach for building many Docker images (e.g. GitLab CI/GitLab Runners), you may want to look into this further; in my case, I use Docker for stable services, so it doesn't matter.

Lastly, I migrated my LXCs from Debian to Ubuntu, since there is a discussion about some kernel options that Debian does not activate by default, which could affect the Docker service. I haven't tested or compared whether the same approach would work with Debian LXCs, but I will do so and report back.

Just to give you some examples, here's a list of Docker services that I have hosted (stable for 2 years) within Proxmox unprivileged LXCs:
  • Gitlab CE
  • Funkwhale
  • Iris/Mopidy/Mosquitto/Snapcast
  • Invidious
  • Grafana
  • Miniflux, RSS-Bridge
  • Postgres
  • Mailcow Dockerized
  • ...
I am not suggesting that this is suitable for an enterprise context, but it works perfectly fine in a private or freelance setting.
 
How would you back up /var/lib/docker, since Proxmox won't allow backups of a mount point like that?
 
This is a good question. Firstly, you have ZFS snapshots: if /var/lib/docker holds important information, then create a snapshot
alongside an LXC backup (or snapshot both the LXC vol and the /var/lib/docker vol with ZFS). If you need files (e.g. for archiving backups), you can also send these snapshots to compressed archives.
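For example (a sketch with placeholder dataset names - on Proxmox, ZFS-backed container volumes typically appear as subvol-<id>-disk-<n> and zvols as vm-<id>-disk-<n>):
Code:
# snapshot the LXC root volume and the /var/lib/docker zvol
zfs snapshot rpool/data/subvol-101-disk-0@before-upgrade
zfs snapshot rpool/data/vm-101-disk-1@before-upgrade
# export a snapshot to a compressed file for archiving
zfs send rpool/data/vm-101-disk-1@before-upgrade | gzip > /path/to/archive/docker-vol.zfs.gz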

However, what I would recommend is to design your Docker stack so that you don't need to back up /var/lib/docker at all.

If you avoid Docker volumes (and replace everything with host mounts in the LXC, e.g. in the docker-compose.yml), then the /var/lib/docker folder will only hold base images that need no backup. I did this in my case, and when I started containers with docker-compose up -d on an empty /var/lib/docker, the base images were automatically pulled because the layers did not exist. All the customizations reside in the host mounts in the LXC. This has the benefit that you're not backing up base images that simply bloat your backups; however, it requires a bit of organization.
 

I was considering ZFS snapshots, but I don't have another pool to send them to. I'm pretty new to ZFS, so I have a lot more to learn.

This option seems interesting. I'm not sure I understand clearly, though. Are you saying to store each Docker container's volumes outside of the /var/lib/docker/volumes vol but within the LXC vol, so the important data gets backed up but not the images, which can easily be replaced? Or to store them on host mounts outside of the LXC altogether?
 
Are you saying to store each Docker container's volumes outside of the /var/lib/docker/volumes vol but within the LXC vol, so the important data gets backed up but not the images, which can easily be replaced?

This. Here's an example: a docker-compose.yml that lives at /srv/miniflux/docker-compose.yml in the unprivileged LXC. I usually create a user for each Docker service, e.g.:

Code:
sudo useradd -r -s /sbin/nologin -m -d /srv/miniflux -U -G docker miniflux
sudo -u miniflux -H bash

The docker-compose.yml with a volume looks like this:
Code:
version: '3'

services:
  miniflux:
    ...
  db:
    image: postgres:13
    restart: unless-stopped
    environment:
      - POSTGRES_USER=${DB_USER:-miniflux}
      - POSTGRES_PASSWORD=${DB_SECRET:-eX4mP13p455w0Rd}
    volumes:
      - miniflux-db:/var/lib/postgresql/data
    networks:
      - miniflux
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "${DB_USER:-miniflux}"]
      interval: 10s
      start_period: 30s

volumes:
  miniflux-db:

Instead, you want to migrate the miniflux-db to a host mount in the LXC, e.g. to /srv/miniflux/data/miniflux-db.

Make a backup of the volume first as a tar archive:
Code:
export VOLUME_NAME=flux_miniflux-db
export BACKUP_PATH=/srv/miniflux/backup
docker run --rm --volume $VOLUME_NAME:/data \
    -v $BACKUP_PATH:/backup ubuntu \
    tar -zcvf /backup/$VOLUME_NAME.tar /data
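If you are unsure of the exact volume name (docker-compose prefixes it with the project name, flux in this example), you can look it up:
Code:
docker volume ls | grep miniflux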

Create the target directory and extract it on the LXC host:
Code:
mkdir -p /srv/miniflux/data/miniflux-db
tar -xvf /srv/miniflux/backup/flux_miniflux-db.tar -C /srv/miniflux/data/miniflux-db --strip 1

Now, change the docker-compose.yml accordingly:

Code:
version: '3'

services:
  miniflux:
    ...
  db:
    image: postgres:13
    restart: unless-stopped
    environment:
      ...
    volumes:
      - /srv/miniflux/data/miniflux-db:/var/lib/postgresql/data
    networks:
      - miniflux
    healthcheck:
      ...

networks:
  miniflux:
    name: ${NETWORK_NAME:-miniflux-network}

This is similar for other persistent data, e.g. modified config files: they are all mounted from the data directory, e.g.
Code:
- /srv/miniflux/data/config.cfg:/var/config.cfg
.. this will back up important data with the standard LXC backups.

Restart docker:
Code:
docker-compose up -d && docker-compose logs --follow --tail 100

.. and remove the Docker volume (cleanup) after testing - note that docker volume prune removes all currently unused volumes, so review the list before confirming:
Code:
docker volume prune
>Are you sure you want to continue? [y/N] y
>Deleted Volumes:
>flux_miniflux-db
>
>Total reclaimed space: 183.9MB

Regarding ZFS snapshots: you need no other pool to send snapshots to. Snapshots are a feature of the ZFS file system that lets you roll back to any previous state; you can have multiple snapshots, and they usually consume very little space. Snapshots can also be sent/exported to files (e.g. a compressed zfs send stream) if you need those for archiving purposes. Anyway, you need no snapshots of the /var/lib/docker vol if you have no Docker volumes (or other persistent data stored in there, which is usually the case for correctly configured containers).
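A rough sketch of how that looks on the Proxmox host (placeholder dataset and CT names again); note that a rollback discards everything written after the snapshot, so stop the container first:
Code:
zfs list -t snapshot rpool/data/vm-101-disk-1
pct stop 101
# rolls back to the most recent snapshot; add -r to also discard newer snapshots
zfs rollback rpool/data/vm-101-disk-1@before-upgrade
pct start 101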
 

Thanks for explaining! That's exactly what I had in mind. I got a bit confused by your earlier post because when you said "host" I thought you meant the Proxmox host.
 

Hey, thanks for sharing this. I tried all the various ways (https://www.reddit.com/r/Proxmox/comments/lsrt28/easy_way_to_run_docker_in_an_unprivileged_lxc_on/ and https://du.nkel.dev/blog/2021-03-25_proxmox_docker/ and other posts) to run Docker in an LXC, but I'm wondering why warning messages are shown on the Proxmox host:

Code:
Feb 14 23:32:22 pve pveproxy[904357]: got inotify poll request in wrong process - disabling inotify
Feb 14 23:33:26 pve pvedaemon[827609]: <root@pam> successful auth for user 'root@pam'
Feb 14 23:41:04 pve kernel: overlayfs: fs on '/var/lib/docker/overlay2/l/RBJK6MFV54PEIPCOKTKPYGQ6H6' does not support file handles, falling back to xino=off.

I created the ZFS volume for the Docker LXC, formatted it (tried both ext4 and xfs), and then mounted it to a directory, setting permissions on files and directories. Docker installed successfully and is running, but that warning message appears on the Proxmox host and I don't understand why.

In the Docker LXC, docker info shows that overlay2 is used.
 
I think this is related to a performance improvement used by Docker to speed up handling of image layers in docker build and docker commit, which does not seem to work in this specific case with XFS on ZFS (see here, also the docs). At the same time, you should see the following warnings in dmesg on the hypervisor:

Code:
overlayfs: upper fs does not support RENAME_WHITEOUT.
overlayfs: upper fs is missing required features.
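
To check for these quickly on the hypervisor (assuming the messages are still in the kernel ring buffer):
Code:
dmesg | grep -i overlayfs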

You can also see it in the docker info report inside the LXC:
Code:
docker info | grep -A 7 "Storage Driver:"
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: false <--
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2

If you are not building Docker images very frequently (such as in a CI / GitLab Runner setup), this should be no problem; it is a performance feature. I see the same messages and have had no problems so far. It might be worth testing different ways to set up the /var/lib/docker filesystem, but since it is working at the moment, this was not high on my list. Also, this solution might be worth a look (also reported here).
 
I have updated the blog post about Docker and unprivileged LXCs:

https://du.nkel.dev/blog/2021-03-25_proxmox_docker/

Tested:
- in Proxmox 5.4, 6.4 and 7.1-10
- with Ubuntu (21.10) and Debian (11) LXC Templates
- using Directory Storage and a formatted ZFS volume mount

All of this is working very well. There were rumors (which probably came from a comment on this repo) that only Ubuntu LXCs are supported. I verified that Debian works equally well (and why shouldn't it..).

I have looked at fuse-overlayfs, but the whole setup seems a little too flaky, so I am sticking with an XFS-formatted ZFS volume. This has worked fine for half a year now since migrating to ZFS. However, note that I haven't actually tested fuse-overlayfs - so any reports are welcome.
 
The name of the package that you need to install on your Proxmox host is "fuse-overlayfs"... yeah, the name is kinda obvious, but who knows. Installing the package will remove "fuse" and replace it with "fuse3", but according to this thread, it shouldn't cause any issues!
Code:
dpkg: fuse: dependency problems, but removing anyway as you requested:
 pve-cluster depends on fuse.
 glusterfs-client depends on fuse.
 ceph-fuse depends on fuse.
That doesn't look good.
And it ended up with a dead Proxmox host :-(

The good part is that it was running on a test machine...
 
Are you sure this needs to be installed on the Proxmox host? This guide suggests otherwise.
On my Proxmox 6.4 host the guide worked perfectly for me with a Debian 11 LXC and ZFS; I didn't install anything on the Proxmox host. I'm also running Docker from the official Docker repo, so there is no need to install the Debian-specific version noted in that post.
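For reference, this is roughly the standard upstream install inside the Debian 11 LXC (a sketch of the procedure from docs.docker.com, run as root in the container):
Code:
apt-get update && apt-get install -y ca-certificates curl gnupg
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian bullseye stable" > /etc/apt/sources.list.d/docker.list
apt-get update && apt-get install -y docker-ce docker-ce-cli containerd.io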
 
Gosh, why is this all so hard? :) LXC will not get widespread adoption with so many pain points, or until there is a simple way to convert Docker containers to LXC (there are instructions available... but they are full of errors and issues).

@Helmut101 your blog post is super helpful, thank you! For me, however, the overlay2 & ZFS approach:
  • is too fiddly (it requires manual zpool intervention and maintenance via the PVE host CLI on each Proxmox node)
  • means you lose native Proxmox snapshot functionality
  • means you lose native Proxmox backup functionality for Docker data
I went down the fuse-overlayfs route. I tried following this page https://c-goes.github.io/posts/proxmox-lxc-docker-fuse-overlayfs/ but ran into issues (at least on a Debian 11 LXC), so I wanted to document/share them here for others:
  1. Installing fuse-overlayfs following the instructions here (fuse-overlayfs) fails on the 4th instruction with "crun": executable file not found in $PATH.
    1. The command
      buildah bud -v $PWD:/build/fuse-overlayfs -t fuse-overlayfs -f ./Containerfile.static.ubuntu . should be
      buildah bud -v $PWD:/build/fuse-overlayfs --runtime /usr/sbin/runc -t fuse-overlayfs -f ./Containerfile.static.ubuntu .
  2. Don't install docker.io.
    1. When starting containers you get the error cgroup2: procHooks: failed to load program: operation not permitted. I am not a smart man, and googling this error hurt my head trying to figure it out.
    2. Instead, use the standard Debian install instructions as detailed in Helmut101's blog post. If you use docker.io, you get errors launching containers.
    3. The standard Debian Docker package should pick up /usr/bin/fuse-overlayfs automatically and use it as the default storage driver (see the check below).
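A quick check that the expected driver is actually in use (a sketch; your output may differ):
Code:
docker info | grep "Storage Driver:"
 Storage Driver: fuse-overlayfs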
 
Just had an issue with Proxmox snapshots / Docker / FUSE. When doing a snapshot, I got an error about not being able to read /var/docker....../fuse or something. I then rolled back, and although the image was rolled back, the Proxmox rollback UI did not reflect which state was current.

Tried it again and it seems okay, but it's a bit worrying.
 
Hi timdonovan, thanks for getting back and for the kudos. Testing fuse-overlayfs (and adding the results to the blog post) is on my list. It's currently not super important since I am fine with my setup, but seeing your results increases my interest. I will try this soon and give an update. Your issue with snapshots sounds strange, but it's good to know about it ahead of time.
 
